DOCUMENT RESUME

ED 312 276                                        TM 014 017

TITLE           Sex and Race Differences on Standardized Tests.
                Oversight Hearings before the Subcommittee on Civil
                and Constitutional Rights of the Committee on the
                Judiciary. House of Representatives, One Hundredth
                Congress, First Session.
INSTITUTION     Congress of the U.S., Washington, D.C. House
                Committee on the Judiciary.
PUB DATE        23 Apr 89
NOTE            309p.; Serial No. 93. Portions contain
                small/semi-legible print.
AVAILABLE FROM  Superintendent of Documents, Congressional Sales
                Office, U.S. Government Printing Office, Washington,
                DC 20402.
PUB TYPE        Legal/Legislative/Regulatory Materials (090) --
                Reports - Research/Technical (143)



EDRS PRICE      MF01/PC13 Plus Postage.
DESCRIPTORS     Academically Gifted; Admission (School);
                Classification; *College Entrance Examinations;
                Elementary Secondary Education; Equal Education;
                Equal Opportunities (Jobs); Females; Handicap
                Discrimination; Hearings; Higher Education; High Risk
                Students; Minority Groups; Occupational Tests;
                Politics of Education; *Racial Differences; *Sex
                Differences; *Standardized Tests; *Test Bias
IDENTIFIERS     Admissions Testing Program; Congress 100th;
                Educational Testing Service; *Scholastic Aptitude
                Test



ABSTRACT 

The purpose of this 1-day hearing was to assess the 
level and effects of bias based on gender and race differences 
affecting standardized tests. The focus was on examining the role of 
standardized tests with respect to educational and employment 
opportunities for women and minorities. Testimony or statements from 
14 witnesses are presented. Subjects addressed include the influence 
of test scores on entry into gifted programs, uses of the Scholastic 
Aptitude Test, the impact of standardized testing on children at 
risk, misclassification of minority students, testing of the 
handicapped, politics of testing, the Admissions Testing Program of 
the College Board, test fairness assurances, and the Educational 
Testing Service's sensitivity review process and its standards for 
quality and fairness. (TJH) 



* Reproductions supplied by EDRS are the best that can be made 

* from the original document. 



SEX AND RACE DIFFERENCES ON
STANDARDIZED TESTS

OVERSIGHT HEARINGS

BEFORE THE

SUBCOMMITTEE ON
CIVIL AND CONSTITUTIONAL RIGHTS

OF THE

COMMITTEE ON THE JUDICIARY
HOUSE OF REPRESENTATIVES

ONE HUNDREDTH CONGRESS
FIRST SESSION

ON

SEX AND RACE DIFFERENCES ON STANDARDIZED TESTS

APRIL 23, 1987



Serial No. 93 



U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement

EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as received from the person or
organization originating it.

Minor changes have been made to improve reproduction quality.

Points of view or opinions stated in this document do not
necessarily represent official OERI position or policy.




Printed for the use of the Committee on the Judiciary 



U.S. GOVERNMENT PRINTING OFFICE 
WASHINGTON : 1989 






For sale by the Superintendent of Documents, Congressional Sales Office
U.S. Government Printing Office, Washington, DC 20402






COMMITTEE ON THE JUDICIARY

PETER W. RODINO, Jr., New Jersey, Chairman

JACK BROOKS, Texas
ROBERT W. KASTENMEIER, Wisconsin
DON EDWARDS, California
JOHN CONYERS, Jr., Michigan
ROMANO L. MAZZOLI, Kentucky
WILLIAM J. HUGHES, New Jersey
MIKE SYNAR, Oklahoma
PATRICIA SCHROEDER, Colorado
DAN GLICKMAN, Kansas
BARNEY FRANK, Massachusetts
GEO. W. CROCKETT, Jr., Michigan
CHARLES E. SCHUMER, New York
BRUCE A. MORRISON, Connecticut
EDWARD F. FEIGHAN, Ohio
LAWRENCE J. SMITH, Florida
HOWARD L. BERMAN, California
RICK BOUCHER, Virginia
HARLEY O. STAGGERS, Jr., West Virginia
JOHN BRYANT, Texas
BENJAMIN L. CARDIN, Maryland

HAMILTON FISH, Jr., New York
CARLOS J. MOORHEAD, California
HENRY J. HYDE, Illinois
DAN LUNGREN, California
F. JAMES SENSENBRENNER, Jr., Wisconsin
BILL McCOLLUM, Florida
E. CLAY SHAW, Jr., Florida
GEORGE W. GEKAS, Pennsylvania
MICHAEL DeWINE, Ohio
WILLIAM E. DANNEMEYER, California
PATRICK L. SWINDALL, Georgia
HOWARD COBLE, North Carolina
D. FRENCH SLAUGHTER, Jr., Virginia
LAMAR S. SMITH, Texas

M. Elaine Mielke, General Counsel
Arthur P. Endres, Jr., Staff Director
Alan F. Coffey, Jr., Associate Counsel



SUBCOMMITTEE ON CIVIL AND CONSTITUTIONAL RIGHTS

DON EDWARDS, California, Chairman

ROBERT W. KASTENMEIER, Wisconsin
JOHN CONYERS, Jr., Michigan
PATRICIA SCHROEDER, Colorado
CHARLES E. SCHUMER, New York

F. JAMES SENSENBRENNER, Jr., Wisconsin
MICHAEL DeWINE, Ohio
WILLIAM E. DANNEMEYER, California

Catherine A. LeRoy, Counsel
Alan Slobodin, Associate Counsel






CONTENTS 



WITNESSES
                                                                  Page
Phyllis Rosser, Contributing Editor, Ms. Magazine; Nancy S. Cole,
  Dean, College of Education, University of Illinois; and Diana
  Pullin, Associate Dean, College of Education, Michigan State
  University ........................................................ 2
    Statement of Phyllis Rosser ..................................... 6
    Statement of Nancy S. Cole ..................................... 32
    Statement of Diana Pullin ...................................... 40
Gretchen W. Rigol, Executive Director, Access Services, the College
  Board, and Carol Anne Dwyer, Executive Director for Test
  Development, School and Higher Education Programs, Educational
  Testing Service .................................................. 69
    Statement of Gretchen W. Rigol ................................. 74
    Statement of Carol Anne Dwyer ................................. 151
Michael C. Behnke, Director of Admissions, Massachusetts Institute
  of Technology, and Denise Carty-Bennia, Professor of Law,
  Northeastern University, and Executive Chair, FairTest, Boston,
  MA .............................................................. 281
    Statement of Michael C. Behnke ................................ 285
    Statement of Denise Carty-Bennia .............................. 297




SEX AND RACE DIFFERENCES ON 
STANDARDIZED TESTS 



THURSDAY, APRIL 23, 1987 

House of Representatives,
Subcommittee on Civil and Constitutional Rights,
Committee on the Judiciary,
Washington, DC.

The subcommittee met, pursuant to call, at 9:33 a.m., in room
[illegible], Rayburn House Office Building, Hon. Don Edwards
(chairman of the subcommittee) presiding.

Present: Representatives Edwards, Schroeder, and Sensenbrenner.

Staff present: Catherine LeRoy, chief counsel; Alan Slobodin, 
associate counsel; Barbara Dobynes-Ward, clerical staff. 
Mr. Edwards. The subcommittee will come to order. 
The gentleman from Wisconsin. 

Mr. Sensenbrenner. Mr. Chairman, I ask unanimous consent
that the subcommittee permit coverage of this hearing, in whole or
in part, by television broadcast, radio broadcast or still
photography, in accordance with Committee Rule 5.

Mr. Edwards. Without objection, so ordered. 

The purpose of today's hearing is to examine the role of a variety
of standardized tests with respect to educational and employment
opportunities for women and minorities.

Americans, especially students, are forced to take an increasing
number of standardized tests. These tests are used for purposes of
school admittance, placement and graduation. Because decisions
affecting educational opportunities and employment opportunities
are based on these test results, we need to know that educational
and vocational tests are, in fact, valid measurements of ability.

The courts in California recently banned the administration of
any IQ test to black students when it was found that the tests were
biased. When a test scores students on the basis of their race and
not their ability, then clearly the test should not be used. Tests
that measure culture in the name of ability deny students and
workers equal access to employment and educational opportunities.
On the basis of public policy and simple fairness, we need to know
where test biases exist and what steps can be taken so that they
can be eliminated.

Our witnesses will be appearing on three panels. I recognize the 
gentleman from Wisconsin, Mr. Sensenbrenner. 

Mr. Sensenbrenner. Mr. Chairman, the minority has no opening
statement this morning.




Mr. Edwards. Thank you, Mr. Sensenbrenner. 

The members of our first panel are Ms. Phyllis Rosser, Contrib- 
uting Editor, Ms. Magazine; Dr. Diana Pullin, Associate Dean, Col- 
lege of Education, Michigan State University, Lansing; and Dr. 
Nancy Cole, Dean, College of Education, University of Illinois, 
Champaign-Urbana, IL. If the members will come to the witness
table, please, we will start with the two who are here. 

Would you raise your right hands, please. Do you solemnly swear 
or affirm that the testimony you are about to give is the truth, the 
whole truth, and nothing but the truth? 

Ms. Rosser. I do. 

Dr. Cole. I do. 

Mr. Edwards. Welcome. Without objection, all the statements 
will be made part of the record. I believe that Phyllis Rosser is 
first. Ms. Rosser, as I said, is a contributing editor to Ms. Magazine. 

STATEMENTS OF PHYLLIS ROSSER, CONTRIBUTING EDITOR, MS. 
MAGAZINE; NANCY S. COLE, DEAN, COLLEGE OF EDUCATION, 
UNIVERSITY OF ILLINOIS; AND DIANA PULLIN, ASSOCIATE 
DEAN, COLLEGE OF EDUCATION, MICHIGAN STATE UNIVERSITY

Ms. Rosser. Thank you. I am glad to be here and pleased that 
the subcommittee is focusing attention on this important issue. 

My name is Phyllis Rosser and I'm a consultant on sex bias in 
testing. As a contributing editor to Ms. magazine for the past 14 
years, I have had articles on education and testing published in Ms.
and other magazines as well. I began researching sex bias in test- 
ing in 1979 and wrote a report for Ms. at that time on the aptitude 
tests used for college and graduate school admissions, on standard- 
ized achievement tests that are given from kindergarten through 
the 12th grade for tracking, on IQ tests that are administered by 
psychologists, and on interest inventories that are used for career 
guidance in high school. 

Most recently, I have been working with the National Center for 
Fair and Open Testing and am principal author of Sex Bias in Col- 
lege Admissions Tests: Why Women Lose Out. 

The tests where sex bias seems to have the greatest impact on 
girls' educational opportunities are the college admissions exams. 
The Scholastic Aptitude Test and the Preliminary Scholastic Apti- 
tude Test/National Merit Qualifying Test, published by Education- 
al Testing Service and the American College Testing Program's As- 
sessment ACT, are taken by over three million students each year. 
They are systematically underpredicting the abilities of high school 
girls. Although females have higher grades than males in all sub- 
jects in high school and higher college grades, even the freshman 
year, senior high school girls averaged 61 points lower than boys on 
the SAT last year, 50 points lower on the math section, and 11 
points lower on the verbal. This is an area where girls excelled 
until 1972. Then boys began to outscore them and the score gap
has gradually widened.

ETS's justification for the use of this test is that it predicts
freshman year college grades, but it is not doing that for girls.
They are 52 percent of the 1 o million test takers, so this means
that scores are being underpredicted for approximately 800,000
females every year. In fact, if these tests were accurately
predicting freshman grades, girls would score 20 points higher than
boys, rather than 61 points lower.

I am sure, if boys were receiving higher grades and lower test 
scores, the tests would be rewritten. 

Minority females are doubly penalized by the test. They all score
lower than the males in their ethnic group who, in turn, score
lower than white males. In 1985, black women scored 43 points
lower than black men, and 264 points lower than white men.

A similar pattern of test bias can be found on ETS's Preliminary
Scholastic Aptitude Test/National Merit Qualifying Test, taken by
1.1 million juniors last year, 54 percent of whom were female.
Girls' score averages were 53 points lower, in SAT terms, than
boys': 41 points in the math, and 12 points in the verbal. To qualify
for the National Merit Scholarship, verbal scores are doubled and
the math is added, in order to give girls more of a chance. But
doubling their lower verbal scores now works against them.
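
The effect of the weighting described above can be checked with a little arithmetic. What follows is only an illustrative sketch, not the National Merit Scholarship Corporation's procedure. It assumes the qualifying index is exactly twice the verbal score plus the math score, and it uses the average gaps quoted in this testimony, in SAT terms: 12 points on the verbal and 41 points on the math (the math figure as given in the written statement). The function name `selection_index` is hypothetical.

```python
def selection_index(verbal: float, math: float) -> float:
    """Qualifying index as described in the testimony: verbal doubled, plus math.

    Assumption: no other adjustment is applied to the two section scores.
    """
    return 2 * verbal + math


# Average gaps (boys minus girls) quoted in the testimony, in SAT terms.
VERBAL_GAP = 12
MATH_GAP = 41

# The index is linear in both scores, so the average gap in the index
# equals the index applied to the average gaps.
plain_gap = VERBAL_GAP + MATH_GAP                   # verbal and math simply added
index_gap = selection_index(VERBAL_GAP, MATH_GAP)   # verbal counted twice

print(plain_gap, index_gap)
```

Under these assumptions the average gap grows from 53 to 65 points in SAT terms, because the verbal gap is counted twice.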

Girls are also scoring lower on the ACT Assessment, taken
annually by nearly a million high school seniors, and 54 percent of
them are also females. Last year, girls scored lower than boys in
math usage, natural science and social studies, but slightly higher
in English usage, averaging six score units lower than boys on the
test overall.

Sex bias on these tests is having a much greater impact on
females than we realize. By underpredicting their academic
performance, these tests affect girls' chances to gain entrance to
nearly 1,500 colleges and universities that require SAT scores or
use SAT cutoff scores for admission.

The University of Texas at Austin requires a combined [illegible]
for out-of-state applicants. The University of California at
Berkeley adds the SAT and three achievement test scores, also tests
where the girls score lower, to the student's grade point average,
which is multiplied by 1,000 in order to rank candidates for
admission.
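
The ranking formula described above can be sketched as follows. This is an illustration under stated assumptions, not the university's actual admissions procedure: it assumes the three components are simply added with no other adjustment, the function name `ranking_index` is hypothetical, and both applicants are invented for the example.

```python
from typing import Sequence


def ranking_index(sat_total: float,
                  achievement_scores: Sequence[float],
                  gpa: float) -> float:
    """SAT total + achievement test scores + GPA x 1,000, as the testimony describes.

    Assumption: the components are added as-is, with no further weighting.
    """
    return sat_total + sum(achievement_scores) + 1000 * gpa


# Two hypothetical applicants; none of these figures come from the hearing record.
applicant_a = ranking_index(sat_total=1100, achievement_scores=(550, 560, 540), gpa=3.8)
applicant_b = ranking_index(sat_total=1200, achievement_scores=(600, 600, 610), gpa=3.6)

# Applicant B ranks above applicant A despite the lower grade point average.
print(applicant_b > applicant_a)
```

If one further assumes a 4.0 grade scale, the test scores can contribute up to 4,000 points (1,600 plus three tests at 800 each), as much as a perfect grade point average, so a group-wide score gap carries directly into the ranking.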

Unfairly low test scores also become a self-fulfilling prophecy,
causing girls to lower their expectations and apply to less
competitive schools than their grades suggest. This is truly
unfortunate. MIT has been accepting girls with lower SAT math scores
and has found they are doing just as well as boys in freshman math
classes.

High school girls are also being denied the opportunity to take 
academic enrichment programs and accelerated courses offered to 
students with high test scores. A number of summer programs are 
offered publicly by States or privately by Ivy League and other 
competitive schools and by well-known prep schools. 

Use of these tests also means less scholarship money for female
college students. Merit scholarships awarded by hundreds of
corporations, foundations, Government agencies, professional
organizations and unions each year are partially based on ACT, PSAT,
or SAT scores. Most of these organizations refuse to provide a gender
or racial breakdown of recipients. However, the National Merit
Scholarship Corporation, which offers the most prestigious awards
for academic excellence, publishes this data. Over $23 million
provided by 670 corporations, foundations, professional
organizations, colleges and universities, is given annually by
National Merit to students with the highest scores on the PSAT. Last
year, girls' qualifying scores averaged 65 points lower than boys',
in SAT terms, and they received only 36 percent of the 6,026
available scholarships, while boys received 64 percent. This year,
the semi-finalists pool, based solely on PSAT scores from which the
winners will be chosen, is 34.7 percent female, 61 percent male, and
the sex of 4.3 percent is unknown.

In the escalating competition for top students, merit scholarships 
are being increasingly used for recruitment. Students with high 
scores on the SAT or the PSAT receive letters offering honor schol- 
arships from a large number of colleges and universities, which 
buy their scores and other student data from ETS. 

The final result of all of this is a real dollar loss for females in 
later life, as they get less prestigious jobs, earn less money, and 
have fewer leadership opportunities. Of course, the life-long loss of 
self-confidence can't be measured in financial terms. 

At present, researchers cannot easily tell which questions are
biased by examining the tests. Only the test publishers know which 
questions females and minorities answer incorrectly, and they have 
not made this information easily available. But there are some 
theories about the gender gap, particularly on the SAT. 

ETS President Gregory Anrig says that a larger pool of test 
takers will have lower scores. ETS also says that more girls than
boys from lower income families take the test. They also have 
lower test scores which reduces the female average. However, de- 
spite their larger pool and lower incomes, the girls who took the 
SAT in 1985, according to College Board data, had higher grades 
than the boys who took it. This test didn't reflect their perform- 
ance in the classroom. ETS says girls take less math and science in 
high school, but College Board data for 1985 shows that girls who 
take the test are almost as likely as boys to have taken four years 
of math. 

The College Board says men take harder courses in college, but 
their own validity studies show girls' college grades in math and
engineering tend to be underpredicted by their SAT scores.

Most insidious of all are those who say girls' grades reflect good 
classroom behavior rather than high intelligence. Of course, grades 
include much more than can ever be measured on a multiple-choice
test, such as the ability to think complexly, solve problems, orga- 
nize information and express oneself clearly. It is generally ac- 
knowledged that girls write better, and the writing tests bear this 
out. 

I have looked at SAT questions over the years and find them
offensive in their consistent male orientation. I recently analyzed 24
reading comprehension passages that appeared on four SATs given 
in the 1984-85 year. I found references to 42 men and three women 
in the 24 passages. Thirty-four of these men were famous and their 
work was cited. One famous woman, Margaret Mead, was men- 
tioned, and her work was criticized. 

David White, a lawyer from California, who has done considerable
research on college admission exams, has found a number of
questions that are demeaning and emotionally loaded for women
[illegible] admissions test, published by ETS, concludes that
"children should be raised only [illegible] centers and
[illegible]." Certainly women who take this test are going to
respond differently to this language than men. It may slow them
down and even shake their confidence for a while.

ETS could change these tests to make them fairer but has
[illegible]. The Stanford-Binet IQ test is written with the
assumption that the sexes are equally intelligent, and it is revised
periodically to keep them equal. ETS receives $17,250,000 for the
[illegible] to change it, to make it fair.

[illegible] indicates that other tests are also biased, such as
[illegible] high school tracking and the Armed Services Vocational
Aptitude Battery, widely used for career guidance in high schools.
I would like the Congress to request that the Department of
Education [illegible] tests having major impacts on students, to see
if they predict what they are supposed to. In order to do this
fairly and accurately, I think it is essential that the researchers
who receive these contracts are not connected with the test
publishers.

[illegible] additional supporting material that I would like to
include [illegible] permission to do that.

Mr. Edwards. Without objection, so ordered.

Ms. Rosser. Thank you.

[The statement of Phyllis Rosser, with attachments, follows:]






TESTIMONY OF PHYLLIS ROSSER

TO THE HOUSE

JUDICIARY COMMITTEE ON CIVIL AND CONSTITUTIONAL RIGHTS

April 23, 1987



I have been a Contributing Editor to Ms. Magazine for the past
fourteen years and I've had many articles on education and learning
published in other magazines as well. I began researching sex bias
in testing for Ms. in 1979, with an open mind. Tests had never kept
me from anything I wanted to do. I don't even remember the SAT
scores I received in 19SJ.

I examined the tests, read testing studies and interviewed the
testing researchers who had written them. I wrote a report for Ms.
in 1980 on Aptitude Tests that are used for college and graduate
school admissions, Standardized Achievement Tests given from
kindergarten through 12th grade, I.Q. tests administered by
psychologists, and Interest Inventories used for Career Guidance in
high school.

I am very pleased that Congress is interested in the effects
standardized tests are having on females and sorry to report that
the tests have not improved much since I began my research. In fact,
on the college entrance examinations, the score gap between the
sexes has widened.

What struck me first when I looked at these tests was the
overwhelming number of males that populated them - all of whom were
engaged in traditional occupations like doctor and lawyer while
women were teachers, nurses and secretaries. According to recent
research, there are still twice as many men as women on most tests
and they are still shown in stereotyped roles, even though this
doesn't represent the world of 1987 at all. Studies done by
Educational Testing Service researchers as far back as 1979 ("Sex
Differences and Sex Bias in Test Content" by Ekstrom, Lockheed,
Donlon, Educational Horizons) show that "females tend to do better
on items that have more female or neutral figures than on items in
which there are male figures." This means that male-oriented content
is not only offensive. It is also a source of bias.

But the tests where sex bias seems to have the greatest impact on
girls' educational opportunities are the college entrance
examinations. The Scholastic Aptitude Test (SAT) and the Preliminary
Scholastic Aptitude Test/National Merit Qualifying Test (PSAT/NMQT)
published by Educational Testing Service and the American College
Testing Program's ACT Assessment (ACT) are systematically
underpredicting the abilities of high school girls. Although females
have higher grades in every subject in high school and higher
college grades, they receive lower test scores on the SAT and the
ACT.

The SAT is composed of two sections, Verbal and Math, each scored
on a 200-800 point scale. The maximum possible score is 1600. Last
year, women's average SAT scores were 61 points lower than men's -
50 points on the Math section and 11 points on the Verbal section -
an area where girls excelled until 1972. Then boys began to outscore
them verbally as well as mathematically (boys have always received
higher Math scores on this test), and the score gap has gradually
widened.



This growing score gap is surprising since ETS says the main
purpose of the SAT is to predict freshman year grades but it's not
doing that for [text missing]

I'm sure, if boys were receiving higher grades and lower test
scores, the tests would be rewritten.

Minority women are doubly-penalized by the test. They all score
lower than the men in their ethnic group, who, in turn, score lower
than white men. In 1985, black women scored 43 points lower than
black men and 264 points lower than white men.

A similar pattern of test bias can be found on ETS' Preliminary
Scholastic Aptitude Test/National Merit Qualifying Test (PSAT/NMQT),
taken by 1.1 million high school juniors last year (who were 54%
female). ETS promotes this as a practice test for the SAT, but the
National Merit Scholarship Corporation awards over $23 million in
student scholarships to the highest scorers on this test.

The PSAT/NMSQT has two parts. Each is scored on a scale of 20 to
80. Testmakers claim an approximation of future SAT scores can be
obtained by multiplying PSAT/NMSQT scores by ten. In 1985-86, girls'
score averages were 53 points lower, in SAT terms, than boys': 41
points in the Math and 12 points in the Verbal. To qualify for the
National Merit Scholarship, verbal scores are doubled and the math
is added - in order to give girls more of a chance. But their lower
verbal scores, which are doubled, are now working against them.

An alternative college entrance exam to the SAT is the ACT
Assessment, a survey achievement test taken annually by nearly a
million high school seniors (54% of whom are female), mainly in the
Mid-West, Southwest and South. The ACT has four sections: English
Usage, Mathematics Usage, Natural Science [illegible] on a scale
that ranges from [illegible]. Last year girls averaged 2.8 score
units lower than boys in Math Usage, 2.5 units lower in Natural
Science, and 1.7 units lower in Social Studies but slightly higher
(1.0 units) in English Usage, averaging 6 units lower than boys on
the test, overall.

[illegible] scores on most of the Achievement Tests [illegible]
admission to some colleges and [illegible] Board's Profiles of
College-Bound Seniors, 1985, girls scored nine points higher on
English Composition and Literature, one point higher on German but
lower on all the other tests.

Sex bias on these tests is having a much greater impact on females
than we realize. By underpredicting their academic performance,
these tests affect their chances to gain entrance to colleges and
universities that require SAT scores or use SAT cut-off scores for
admission. They also markedly diminish their chances to obtain merit
scholarships based on test scores, and to enter many special
educational programs for gifted high school students that use SAT
scores in their admissions criteria.

Test Scores Affect College Admission

[illegible] the SAT score, the scores on three ETS Achievement
Tests (where girls also receive lower scores) and the Grade Point
Average multiplied by 1000, to rank candidates for admission.

Although some colleges may not actually use scores in the
selection process, they often publish the average SAT scores of
their previous freshman class to establish high academic
credentials. As a result, women with lower SAT scores will lower
their expectations and apply to less competitive schools than their
grades suggest. Ernest Boyer recently reported in College: The
Undergraduate Experience in America that 62% of the students
questioned said they lowered their college expectations after
receiving their SAT scores.

Low Test Scores Reduce Entry Into 'Gifted' Programs

A large number of academic enrichment programs are offered to
students with high SAT or PSAT scores. Fewer of these opportunities
are offered to females, due to their lower scores. This means they
not only lose the opportunity to enhance or accelerate their high
school program, but also have less impressive resumes of
extracurricular academic activities to present on college
applications.

In New Jersey, outstanding honors students in science and
political science with high SAT scores are invited to attend the
Governor's School, a summer enrichment program held on college
campuses. 65% of the attendees at the science school this summer
will be male, 35% will be female, from a pool of applicants that was
71% male. High PSAT scores and high grade point averages are also
used to select one student from each high school in New Jersey to
attend the New Jersey Scholars Program held at the Lawrenceville
School each summer.

In Washington, D.C., students with high SAT Math scores are
offered opportunities to take advanced math courses on college
campuses during the summer. Additionally, high scoring students
whose parents can afford summer school tuition have a smorgasbord of
opportunities to develop their giftedness. Summer enrichment courses
are offered by Ivy League and other competitive schools, and by
well-known prep schools. This summer, the George School in Newtown,
Pennsylvania and Blair Academy in Blairstown, New Jersey will offer
courses in advanced mathematics, college science, computer science,
languages, literature, the arts and, ironically, PSAT and SAT
coaching.

Johns Hopkins University's Center for the Advancement of
Academically Talented Youth (CTY) invited ?6,876 seventh grade boys
and girls in 19 states to take the SAT, to determine if they were
mathematically or verbally talented. Junior High School students
qualify for this by scoring in the upper 3% on the mathematics
section of a national standardized achievement test. Those who score
500 or more on the Verbal or Math section are invited to attend one
of their five camps for "gifted and talented" students.

This summer, invitations to the Johns Hopkins program will be
extended to over 2,500 boys but only 1,081 girls. Although an equal
number of boys and girls take the test, girls' lower SAT scores keep
them from qualifying for these high-powered summer programs. They
may also suffer a blow to their self esteem and lower their
expectations about future SAT performance - before they even reach
high school.







Low Test Scores Deny Merit Scholarship Money

Use of exam scores also means less merit scholarship money for
female college students. Merit scholarships awarded by hundreds of
corporations, foundations, government agencies, professional
organizations and unions each year are partially based on ACT, SAT
or PSAT scores. Most of these organizations refuse to provide a
gender or racial breakdown of scholarship recipients. However, the
National Merit Scholarship Corporation, which offers the most
prestigious awards for academic excellence, publishes this data.

Over 23 million dollars, provided by 670 corporations,
foundations, colleges and universities are given annually to
students with the highest PSAT scores. Last year girls' qualifying
scores averaged 65 points lower than boys' (in SAT terms) and they
received only 36% of the 6,026 scholarships awarded while boys
received 64%. This year the semi-finalist pool (based solely on PSAT
scores) from which the winners will be chosen has 15,507 students;
34.7% are female and 61% are male (the sex of 4.3% is unknown). (see
State-by-State breakdown, p. 10)

Semi-finalist status is given to students whose PSAT scores
(twice Verbal and Math score) rank them in the top half of 1% in
each state. In order to obtain scholarship money, semi-finalists
submit information about their academic records, extracurricular
activities, leadership potential and intended college major, along
with their principal's recommendation to the National Merit
Corporation's selection committee. Students must also duplicate
their high PSAT score with "an equivalent high Scholastic Aptitude
Test performance," according to their Program Guide. This also works
against lower-scoring females. In 1985-86, of the 13,777 Merit
Finalists, 64.1% were male and 35.9% were female. 43.7% of the
finalists actually receive Merit Scholarships.

An alarming trend for women is evident in the National Merit
Corporation's Annual Reports. Although the total number of
scholarships awarded annually has increased, the number and
percentage of female recipients has decreased noticeably in the last
three years. In 1983-84 National Merit Scholars were 40.2% female,
in 1984-85, 37.9% were female, in 85-86, 36% were female.

It is impossible to calculate exactly how many millions of
dollars girls lost in this uneven split because Merit Scholarships
are awarded in three categories. National Merit Corporation awards
1,800 of its own $2,000 scholarships annually. In addition, it
administers the awarding of scholarships for '125 corporations and
2,800 colleges and universities in amounts ranging from $250 to
$8,000 per year.

The National Merit Corporation also administers the awarding of
1,179 "Special" corporate scholarships worth $7.6 million. These
scholarships are awarded to students with scores below the finalist
level who are interested in a career the grantor wants to encourage,
or who live in a community where the company has offices. (see
Appendix II for list of corporate and business sponsors of special
merit scholarships in 1986)

[illegible] scholarships, worth over $40 million annually.
[illegible] Excellence awarded were to boys, while only 270 went to
girls. The gender of 58 winners could not be determined by name.

Males also won more of New York State's 25,000 Regents College
Scholarships, which are exclusively determined by SAT or ACT scores, and
worth up to $1,250 each. Of the 109,266 students who competed for the
scholarships, 47% were male and 53% were female. However, 57% of the 25,277
winners were male and 43% were female.
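
The disparity in the Regents figures can be restated as award rates by sex.
The short Python sketch below is an editorial illustration and not part of
the testimony. The function name `award_rates` is hypothetical, and the head
counts are derived from the rounded percentages quoted in the paragraph, so
the results are approximate.

```python
def award_rates(competitors, winners, male_share_competitors, male_share_winners):
    """Return (male_rate, female_rate): the share of each sex's competitors who won.

    All head counts are estimated from rounded percentages, so the
    rates are approximations rather than exact figures.
    """
    male_competitors = competitors * male_share_competitors
    female_competitors = competitors * (1 - male_share_competitors)
    male_winners = winners * male_share_winners
    female_winners = winners * (1 - male_share_winners)
    return male_winners / male_competitors, female_winners / female_competitors


# Figures quoted in the testimony: 109,266 competitors (47% male)
# and 25,277 winners (57% male).
male_rate, female_rate = award_rates(109_266, 25_277, 0.47, 0.57)
print(f"male award rate:   {male_rate:.1%}")
print(f"female award rate: {female_rate:.1%}")
```

Under these assumptions roughly 28 percent of the male competitors won a
scholarship, against roughly 19 percent of the female competitors.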

Once FairTest and NYPIRG made this discrepancy public, the New York
State Board of Regents moved swiftly. Acknowledging that women's lower SAT
scores kept them from receiving their fair share of merit scholarships, the
Regents voted unanimously to ask the legislature for funds to develop a new,
unbiased test.

Other states use a combination of grades and test scores for their merit
programs with more equitable results. New Jersey requires students to have
SAT scores of 1200 or more and also rank in the top 10% of their high school
class to qualify for Garden State Distinguished Scholarships. Up to $4,000
is awarded annually to 800 students (at least 2 from each school) for a total
of $3,200,000 to encourage them to attend colleges in New Jersey. Last
year's Garden State Distinguished Scholars were 50% female and 50% male.

A computer print-out from a typical northeastern high school guidance
office lists 13[?] scholarships tied to test scores. These "merit"
scholarships are given by unions, fraternal organizations, religious
denominations, corporations (mainly sponsoring children of employees),
professional organizations, and the military. Most of these scholarships are
awarded to students with high test scores in combination with high grades, an
interest in pursuing a particular course of study and/or financial need.
Engineering societies predominate, giving more career-based merit
scholarships than any other group. (See Appendix III for a partial listing of
private scholarships based on SAT and ACT scores.)

In the escalating competition for top students, merit scholarships are
being increasingly used for recruitment. According to a 1984 study, more than
[?]5% of four-year private colleges and nearly 90% of public institutions offer
no-need scholarships for academic excellence, and substantially more of these
are being offered now than even five years ago. In private, four-year
colleges, [?]4% of this no-need money is taken from tuition and fee income,
raising important questions about the spiraling costs of college tuition.

Last year, one New Jersey student who received a $4,000 Garden State
Distinguished Scholarship found his mailbox full of additional scholarship
offers. Thirteen New Jersey colleges offered him grants ranging from $2,000
to $12,000. Drew University in Madison, N.J. also told him that it offers
$4[?],000 to students who score 1350 or better on the SAT and $32,000 to
students with 1300 SAT's.

Two out-of-state colleges offered this student "honors" scholarships
outright, ranging from $500 to $10,000. Sixteen other colleges and
universities told him he qualified for their merit scholarships, some of
which covered full tuition. In addition, eight universities, including the
Universities of Michigan, Indiana, and Delaware, offered him admission to
their Honors Programs in which a small, select group of academically-talented
students attend a smaller, select college within the university. They are
given enriched academic programs, honors grants, and live together in a
separate residence hall.






The final result of lower test scores is a real dollar loss for females
in later life as they get less prestigious jobs, earn less money, and have
fewer leadership opportunities. Of course, the life-long loss of
self-confidence can't be measured in financial terms.

Why The Gender Gap?

It is impossible to tell which questions are biased by examining the
tests. Only the test publishers know which questions females and minorities
answer incorrectly and they have not made this information easily available.
A bill is currently moving through the New York State Legislature which would
require publishers to provide a gender and racial analysis of test questions
for an entire year.

In the meantime, there are some theories about the gender gap,
particularly on the SAT.

ETS President Gregory Anrig says that a larger pool of test takers will
have lower scores. ETS also says that the larger pool of girls includes more
girls from lower income families who have lower test scores which in turn
reduces the average female scores. However, the girls who took the SAT in
1985, according to the College Board's Profiles of College-Bound Seniors, had
higher grades than the boys who took it, despite their larger pool and lower
incomes.

[illegible], Director of Public Affairs for The College Board,
[illegible] in high school than boys," to explain [illegible] on the math
section. However, the College Board's Profiles for 1985 shows that girls who
take the test are almost as likely as boys (50.5% vs. 57.6%) to have taken
four years of math.

[illegible] courses in college. They are less
likely to be taking science and engineering where grades are lower because
the courses are harder. But The College Board's own validity studies show
that women who major in engineering and math in college tend to receive
higher grades than their SAT scores had predicted. Massachusetts Institute
of Technology [illegible] admitting women with lower SAT Math scores and
finds they do just as well as men in freshman math classes.

[illegible] against females in society. [illegible] differently in the
classroom which may [illegible] they perform on standardized tests. Although
the society is biased against females and the classroom reflects that, girls
are able to overcome this handicap and earn better grades, even though they
receive less classroom attention.

Most insidious of all are those who say girls' grades reflect good
classroom behavior rather than high intelligence. As we all know, grades
[illegible] on a multiple-choice test, such as the ability to think
complexly, solve problems, organize information and express [illegible]

[illegible] (There are six reading passages on each SAT.) I found references
to [illegible] men and three women in the 24 passages. 34 of these men were
famous and their work was cited. One woman, Margaret Mead, was famous and her
work was criticized.

David White, a lawyer from California, has done considerable research on
the graduate entrance exams published by ETS - the Law School Admissions Test
(LSAT), the Graduate Record Examination (GRE) and the Graduate Management
Admissions Test (GMAT) - and found a number of questions that are
emotionally-loaded and offensive for women and blacks. For example, one
question on the LSAT concludes that "children (should) be raised only by
their mothers, and...not be farmed out to day-care centers and full-time
babysitters." Certainly the mothers who are taking this test are going to be
"farming out their children."

David White and I both feel this type of demeaning question slows down
test takers and may even shake their confidence for a while on a test that
requires the utmost in speed and risk-taking. At the very least, they cast
doubt on ETS's Sensitivity Review.

ETS could change these tests to make them fairer for girls but has
chosen not to do this. The widely-used Stanford-Binet I.Q. Test is written
with the assumption that the sexes are equally intelligent. It is
periodically revised to make sure the sexes score equally well.

I would like to ask ETS why it has decided that boys are smarter than
girls? I would also like to know what the SAT is predicting, if it's not
freshman grades? ETS receives $17,250,000 for this test every year that
doesn't do what it's supposed to for over half the people taking it. I think
that is consumer fraud.

I also think unfair college admissions tests may be the tip of the
iceberg. Recent research indicates that other tests are also biased against
girls, like the standardized achievement tests used for high school tracking
and the Armed Services Vocational Aptitude Battery (ASVAB), the most
widely-used aptitude test for career guidance in high schools.

I would like Congress to request that the Department of Education
investigate tests that are having major impacts on students - to see if they
predict what they are supposed to. In order to do this fairly and
accurately, I think it is essential that the researchers who receive these
contracts are not connected with the test publishers.



The statistics, charts and some of the information presented here were first
published in the National Center for Fair and Open Testing Report on Sex Bias
in College Admissions Tests: Why Women Lose Out by Phyllis Rosser with the
staff of National Center for Fair and Open Testing, April 1987.




[Figure: SAT Score Averages for College-Bound Seniors, 1972-86. Vertical
axis: Scores. Horizontal axis: Years, 1972 to 1986. Series: Male Total,
Female Total.]

Average SAT Scores

                            Females       Males         Diff
Asian/Pacific Americans     897           946           [illegible]
Black                       705           74[?]         [illegible]
Mexican-American            775           845           [illegible]
Native Americans            790           855           -65
Puerto Rican                744           820           -76
White                       [illegible]   [illegible]   -57
National Average            877           938           -61

From 1985 Profiles, College-Bound Seniors, Leonard Ramist and Solomon
Arbeiter, CEEB.



[Figure: 1986 National Merit Scholarship Semi-Finalists.]

[Figure: National Merit Scholarship Winners (over last three years by
gender). Vertical axis: 2500 to 4000. Series: Male, Female. Totals: 5,858
in 1983-84, 6,021 in 1984-85, 6,026 in 1985-86.]



National Merit Semifinalists for 1986-87
State Breakdown by Gender

STATE            # GIRLS  % GIRLS      # BOYS  % BOYS  # UNKNOWN     TOTAL
Alabama               82  36.4%           140  62.1%           3       225
Alaska                12  44%              13  48%             2        27
Arizona               55  33.7%            99  60.7%           9       163
Arkansas              47  29.3%           110  68.8%           3       160
California           495  35.4%           817  58.3%          88     1,400
Colorado              70  35%             120  60.6%           8       198
Connecticut           81  33.3%           152  62.5%          10       243
Delaware              10  23.2%            30  69.8%           3        43
D.C.                  25  38.5%            39  60%             1        65
Florida              176  32.9%           326  61%            32       534
Georgia              123  36.1%           217  64%             1       341
Hawaii                35  46.7%            34  45%             6        75
Idaho                 13  20.6%            49  77.8%           1        63
Illinois             258  33.8%           472  61.9%          33       763
Indiana              134  34.4%           247  63.3%           9       390
Iowa                  85  35.6%           148  62%             6       239
Kansas                64  39.7%            86  53.4%          11       161
Kentucky              75  31.9%           153  65.1%           7       235
Louisiana             87  32.4%           168  62.7%          13       268
Maine                 28  32.1%            56  64.4%           3        87
Maryland             116  33.9%           206  60.2%          20       342
Massachusetts        178  35.3%           308  61.2%          17       503
Michigan             218  32.4%           426  63.4%          28       672
Minnesota            125  37.1%           203  60.2%           9       337
Mississippi           44  29.5%           102  68.4%           3       149
Missouri             131  37.8%           192  55.3%          24       347
Montana               32  51.6%            26  41.9%           4        62
Nebraska              39  30.1%            80  62%            10       129
Nevada                19  38%              30  60%             1        50
New Hampshire         45  35.4%            79  62.2%           3       127
New Jersey           170  33%             327  63.5%          18       515
New Mexico            24  [illegible]      65  72.2%           1        90
New York             373  32%             730  62.7%          62     1,165
North Carolina       155  39.2%           232  58.7%           8       395
North Dakota          20  [illegible]      25  52%             3        48
Ohio                 314  39.9%           446  56.7%          27       787
Oklahoma              55  26.6%           138  66.6%          14       207
Oregon                58  36.7%            93  58.9%           7       158
Pennsylvania         303  34.1%           545  61.5%          38       886
Rhode Island          22  33.3%            44  64.7%           2        68
South Carolina        80  34.7%           142  61.7%           8       230
South Dakota          19  39.5%            26  54.2%           3        48
Tennessee            115  38.2%           176  58.5%          10       301
Texas                295  31.[?]%         579  62.4%          54       928
Utah                  41  36.6%            66  58.9%           5       112
Vermont               11  30.5%            24  66.6%           1        36
Virginia             131  35.2%           224  60.2%          17       372
Washington            86  35.5%           135  55.8%          21       242
West Va.              54  40%              79  58.5%           2       135
Wisconsin            111  31.6%           227  64.7%          13       351
Wyoming               13  37.1%            19  54.3%           3        35

TOTAL              5,352  34.7%         9,470  61%   685 (4.3%)    15,507






JUNE 1984 SAT

Reading Comprehension Passage on Margaret Mead

Many social anthropologists and other scientific observers of human
communities have emphasized the similarities in the sex roles in various
communities. One very distinguished anthropologist, Margaret Mead, in her
book, Male and Female, gives this summary description of the sex roles: "The
home shared by a man or men and female partners, into which men bring the
food and women prepare it, is the basic common picture the world over. But
this picture can be modified, and the modifications provide proof that the
pattern itself is not something deeply biological."

It is surprising that Margaret Mead, with her extensive and intensive
personal experience of diverse communities throughout the world, should
venture upon such a dubious generalization. She is right in describing the
preparation of food as a monopoly for women in nearly all communities, but
the surmise that the provision of food is a man's prerogative is
unwarranted. In fact, an important distinction can be made between two kinds
or patterns of subsistence agriculture: one in which food production is
taken care of by women, with little help from men, and one in which food is
produced by the men with relatively little help from women. As a convenient
terminology I propose to denote these two systems as the female and male
systems of farming.

Which of the following best explains what the author means by "the
similarities in the sex roles" (lines 2-3)?

(A) The equality of men's and women's traditional tasks
(B) The likenesses in patterns of division of labor between men and women
(C) The universal acceptance of the need for cooperation between men and
    women within a community
(D) The overlapping of tasks performed by men and women in various
    communities
(E) The correspondence between a community's attitude toward women and the
    traditional tasks they perform

33. The author's attitude toward the statement by Margaret Mead is one of

(A) reluctant consent
(B) intrigued curiosity
(C) respectful disagreement
(D) apologetic defensiveness
(E) mild endorsement

Which of the following best describes the relation between the two
paragraphs in the passage?

(A) The second disputes aspects of the opinions presented in the first.
(B) The second explains the logic behind the arguments summarized in the
    first.
(C) The second provides clear examples of the general statements presented
    in the first.
(D) The second questions the social importance of the issues raised in the
    first.
(E) The second analyzes the implications for the future of the theories
    described in the first.

GO ON TO THE NEXT PAGE



Appendix I

Nearly 300 four-year accredited colleges and
universities have either absolute cut-off scores or specify a
cut-off score as a leading component of their admissions
programs for all or one of their programs, or utilize test
scores in a qualifying numerical formula. The following is
the list of those colleges and universities:

Institution

Abilene Christian University
Akron University
Alabama State University
University of Alabama, Birmingham
Albany State College
Alcorn State University
Allentown College of St. Francis de Sales
Alvernia College
Alma College
Angelo State University
Arizona State University
University of Arizona
Arkansas College
Arkansas State University
University of Arkansas, Fayetteville
Armstrong State College
Auburn University
Augusta College
Austin Peay State University
Avila College
Ball State University
Belmont College
Bemidji State University
Benedictine College
Bethany College
Bethel College
Black Hills State College
Bluefield State College
Bluffton College
Butler University
California Baptist College
California Polytechnic State University
California State College, Bakersfield
California State College, San Bernardino
California State Polytechnic
California State University, Chico
California State University, Carson
California State University, Fresno
California State University, Fullerton
California State University, Hayward
California State University, Long Beach
California State University, Los Angeles
California State University, Northridge
California State University, Sacramento
California State University, Turlock
California University of Berkeley
California University of Davis
California University of Irvine
California University of Los Angeles
California University of Riverside
California University of Santa Barbara
California University of Santa Cruz



Cameron University
Castleton State College
Centenary College of Louisiana
Central College
Central Missouri State
Central Florida University
Central State University
Chicago State University
CUNY, Bernard M. Baruch College
CUNY, Brooklyn
CUNY, City College
CUNY, College of Staten Island
CUNY, Hunter College
CUNY, Queens
Colorado University of Colorado Springs
Colorado University of Denver
Columbia Union College
Concord College
Concordia College
Dakota Wesleyan University
Dana College
Devry Institute of Technology, City of Industry
Devry Institute of Technology, Irving, Texas
Devry Institute of Technology, Decatur, Georgia
Devry Institute of Technology, Chicago
Devry Institute of Technology, Lombard, Illinois
Devry Institute of Technology, Columbus, Ohio
Dickinson State College
East Central University
Eastern Illinois University
East Tennessee State University
East Texas State University
Eastern College
Eastern Kentucky University
Eastern Mennonite College
Eastern New Mexico University
Embry Riddle Aeronautical University
Evansville University
Fairmont State College
Felician College
Florida Atlantic University
Florida Southern College
Florida State University
Florida University of Gainesville
Fort Wayne Bible College
Georgia University of Athens
Georgian Court College
Glenville State College
Graceland College
Houston Baptist University
University of Houston, Houston
Humboldt State University
Illinois State University
Indiana State University, Terre Haute
Indiana University, Bloomington
Indiana University, Kokomo
Indiana University, Northwest, Gary
Indiana University, Purdue Univ. at Indianapolis
Indiana University, South Bend
Iowa State University, Ames
Iowa University of Iowa City



Jackson State University
Jacksonville State University
Jacksonville University
John Brown University
Kent State University
Kentucky State University
Kentucky Wesleyan College
La Roche College
Lamar University
Lander College
Lewis-Clark State College
Loras College
Louisiana College
Loyola University
Maine, University of, Fort Kent
Maine, University of, Presque Isle
Mankato State University
Mansfield University of Pennsylvania
Mary Hardin-Baylor, University of, Belton, Texas
Maryland, University of, College Park
Maryland, University of, Eastern Shore
Massachusetts, University of, Amherst
Mercyhurst College
McMurry College
Mesa College
Memphis State University
Metropolitan State College, Denver
Miami Christian College
Middle Tennessee State University
Minot State College
Montana College of Mineral Science & Technology
Mississippi College
Mississippi State University
Mississippi, University of
Mississippi, University for Women
Mississippi Valley State University
Missouri Southern State College
Missouri, University of, Kansas City
Missouri Western State College
Mobile College
Moorhead State University
Molloy College
Morehouse College
Morgan State University
Nicholls State University
Mt. St. Clare College
Mt. Vernon Nazarene College
Nebraska, University of, Lincoln
Nebraska, University of, Omaha
New England, University of, Biddeford
New Mexico Institute of Mining and Technology
New Mexico State University
New York Institute of Technology
New York University
Nichols College
North Alabama, University of
North Arizona University
North Carolina Agricultural and Technical
North Carolina, University of, Asheville
North Carolina, University of, Charlotte
North Central College
Northeast Missouri State University



North Florida, University of, Jacksonville
North Texas State University, Denton
North Arizona University
Northeastern Oklahoma University
Northern Colorado, University of, Greeley
Northern Illinois University
Northern Kentucky University
Northern State College
Northwestern Oklahoma
Northwestern College
Ohio State University
Oklahoma State University
Oklahoma, University of, Norman
Oklahoma Baptist College
Oklahoma Panhandle State University
Old Dominion University
Oregon State University
Oregon, University of, Eugene
Quincy College
Pikeville College
Portland State University
Portland, University of, Portland
Rio Grande College
St. Ambrose College
St. Francis College
St. Cloud University
St. Louis College of Pharmacy
St. Leo College
St. Mary's College of Maryland
St. Mary, College of, Omaha, Nebraska
St. Mary's University
St. Paul Bible College
San Francisco State University
San Jose State University
Savannah State College
Science and Arts of Oklahoma, University of, Chickasha
Shepherd College
Sioux Falls College
Sonoma State University
South Dakota School of Mining and Technology
Southeast Missouri State University
Southern Illinois at Carbondale
Southern Illinois University
Southwest State University
South Dakota State University
Southern Oklahoma University
Southern Colorado, University of
South Nazarene, University of
Southern Oregon State
South West Texas State University
Southwestern Oklahoma State University
South Alabama, University of, Mobile
South Carolina, University of, Aiken
South Carolina, University of, Conway
South Carolina, University of, Spartanburg
South Florida, University of, Tampa
Southeastern Louisiana University
Southern Arkansas University
Southern College of Seventh Day Adventist
Southern Mississippi, University of, Hattiesburg
Southern Louisiana, University of, Lafayette



Spalding University
Spring Arbor College
Sul Ross University
Stockton State College
SUNY College of Environmental Science and Forestry
SUNY College of Geneseo
SUNY College of New Paltz
SUNY College of Old Westbury
Talladega College
Tarleton State University
Tennessee State University
Tennessee Technological University
Tennessee, University of, Chattanooga
Tennessee, University of, Knoxville
Tennessee, University of, Martin
Texas A and I University
Texas A & M University
Texas College
Texas Tech. University
Texas, University of, Arlington
Texas, University of, Austin
Texas, University of, El Paso
Texas, University of, San Antonio
Thomas More College
Toledo, University of
Transylvania University
Trevecca Nazarene College
Trinity College
Tusculum College
Tuskegee University
Union University
Valley City State College
United States Merchant Marine Academy
Valley City State College
Virginia, University of, Wise, W. Virginia
Warren Wilson College
Wayne State University
Weber State College
West Liberty State College
West Virginia Institute of Technology
Western Connecticut State University
Western Illinois University
Western Kentucky University
Western Michigan University
Western Oregon State College
Wheeling College
Winona State University
Winston-Salem State University
Wisconsin, University of, LaCrosse
Wisconsin, University of, Kenosha, Parkside
Wisconsin, University of, Superior
Wisconsin, University of, Whitewater
York College of Pennsylvania



Appendix II

List of Corporations who give Special Merit Scholarships,
administered by the National Merit Scholarship Program

Abex Foundation Inc.
Acushnet Foundation
Albany International
Alco Standard Foundation
Allied Van Lines Memorial Fund
Amcast Industrial Foundation
American District Telegraph Company
American Express Foundation
American Optical Foundation
The American Tobacco Company
AmeriTrust Corporation
Ametek Foundation, Inc.
Amfac Inc.
Arthur Andersen & Co. Foundation
Armstrong Rubber Company Fdn., Inc.
Armstrong World Industries, Inc.
The Aro Corporation
Avery International
Bank of America-Giannini Foundation
BASF Corporation-Chemicals Division
BASF Corporation-Fibers Division
BASF Corporation-Inmont Division
Basic American Foods Company
Bell & Howell Foundation
Bemis Company Foundation
Loren M. Berry Foundation
The Black & Decker Corporation
Blue Bell Foundation
The Bristol-Myers Fund, Inc.
Brockway Glass Company Foundation
Brown & Williamson Tobacco Corporation
Browning-Ferris Industries, Inc.
Burndy Corporation
Burroughs Wellcome Co.
Carson Pirie Scott Foundation
Carter-Wallace, Inc.
Castle & Cooke, Inc.
Celanese Corporation
Central Soya Foundation, Inc.
Centronics Data Computer Corp.
Charter Medical Corporation
Chesebrough-Pond's Inc.
The Clorox Company Foundation
Collins & Aikman Corporation
Combined International Corporation
Combustion Engineering, Inc.
Communications Satellite Corporation
Consolidated Papers Foundation, Inc.
Consolidation Coal Company
The Continental Corporation Foundation
Continental Grain Foundation
Crompton & Knowles Foundation, Inc.
Crum and Forster Foundation
Dart & Kraft Foundation



Data General Corporation
Del Monte Corporation
Diamond Shamrock Corporation
A. B. Dick Company
Dillingham Corporation
R. R. Donnelley & Sons Company
Dow Jones Foundation
Dresser Foundation, Inc.
EOT Group of Penn Central Corporation
The El Paso Natural Gas Company
Equimark Corporation
Estee Lauder Inc.
Ex-Cell-O Corporation
Fafnir Bearing Division of
    The Torrington Company
The Filene Charitable Foundation
Firestone Trust Fund
First Fidelity Bancorporation
First Interstate Bank of Arizona, N.A.
    Educational Foundation
Fischbach Corporation
Fischer & Porter Co.
Florida Steel Corporation Fdn., Inc.
Gannett Foundation, Inc.
General Foods Corporation
Gleason Memorial Fund, Inc.
W.W. Grainger, Inc.
GrandMet USA, Inc.
Great American Insurance Company
Great Northern Nekoosa Fdn., Inc.
Gulf + Western Foundation
Harsco Corporation Fund
Helene Curtis Industries, Inc.
Henkel Corporation
KKX Holdings, Inc.
Hobart Corporation
Hoffmann-La Roche Inc.
Homestake Mining Company
Geo. A. Hormel & Company
Hospital Corporation of America
Illinois Tool Works Foundation
Insilco Corporation
Interlake Foundation
Ivey Trust Fund
Johnson & Higgins
Johnson Worldwide Associates, Inc.
The Johnson's Wax Fund, Inc.
Kaman Corporation
The Kennametal Foundation
Kenosha Foundation
Kerr-McGee Corporation
Kidde Consumer Durables Corp.
Kidder, Peabody & Co., Incorporated
Knight-Ridder Newspapers, Inc.
Kraft, Inc.
Leeds & Northrup Foundation
Lennox Foundation
Libby, McNeill & Libby, Inc.
The Liberty Corporation Foundation
Thomas J. Lipton Foundation, Inc.
Lloyds Bank California
Loews Foundation



The LTV Foundation
Mattel Foundation
The McGraw-Hill Foundation, Inc.
Estate of John E. McKeen
McKesson Foundation, Inc.
McNeilab, Inc.
Mellon Bank N.A.
Edwin T. Meredith Foundation
Midland-Ross Foundation
Minnesota Mining and Manufacturing Co.
Mitchell Energy & Development Corp.
The Modine Foundation, Inc.
Monsanto Fund
Morgan Guaranty Trust Company
G. C. Murphy Company Foundation
Murphy Oil Corporation
Nabisco Foundation
Nalco Chemical Company
National Distillers Distributors Fdn.
National Medical Enterprises, Inc.
National Starch & Chemical Fdn., Inc.
Nestle Foods Corporation
New Jersey Manufacturers Insurance Co.
Norfolk Southern Foundation
Ortho Pharmaceutical Corporation
Owens-Corning Fiberglas Corporation
Frank E. and Seba B. Payne Foundation
Pechiney Corporation
The Penn Mutual Charitable Trust
PepsiCo Foundation, Inc.
Pet Incorporated
Petrolane Incorporated
Pfizer Inc.
Phelps Dodge Foundation
The Jesse Philips Foundation
PPG Industries Foundation
The Procter & Gamble Fund
Prudential-Bache Foundation
Public Service Co. of New Hampshire
Puritan-Bennett Corporation
The Quaker Oats Foundation
Quanex Foundation
Quasar/Matsushita Industrial Company
Raytheon & Local 1505 IBEW
RB&W Corporation
The Richman Brothers Foundation, Inc.
The Riegel Textile Corporation Fdn.
RJR Nabisco, Inc.
RKO General, Inc.
St. Joe Minerals Corporation
Sandoz Corporation
Sara Lee Foundation
Schering-Plough Foundation, Inc.
Schlegel Corporation
Service America Corporation
SFN Companies, Inc.
Shaklee Corporation
Shell Companies Foundation, Inc.
Siemens Capital Corporation
Simmonds Precision Products, Inc.
Snap-on Tools Corporation



Robert S. Solinsky Scholarship Fund
Sony Corporation of America Fdn., Inc.
The Standard Oil Company
The Stanley Works
State Farm Companies Foundation
Stewart-Warner Foundation
Stranahan Foundation
The Aaron & Lillie Straus Fdn., Inc.
Suburban Propane Gas Corporation
Sun Company, Inc.
Sunshine Biscuits Foundation
Talley Industries, Inc. Foundation
The Tappan Company
Technicon Corporation
Telex Computer Products, Inc.
The Times Mirror Company
Timex Corporation
Henry R. Towne Trust
Transamerica Corporation
Transco Energy Company
Transway International Foundation
Triangle Foundation
Union Bank
Uniroyal, Inc.
United Energy Resources, Inc.
The UPS Foundation
Warner-Lambert Company
Weyerhaeuser Company Foundation
The Williams Companies
Wilson Foods Corporation
Wilson Sporting Goods Co.
The Witco Foundation
Wm. Wrigley Jr. Company
Zapata Corporation



Appendix III

ADDITIONAL SCHOLARSHIPS INFLUENCED BY SAT OR ACT [illegible]
(partial listing)

[illegible]          400           SAT/ACT   $500 to $7000 ea.
[illegible]          25            SAT/ACT   [illegible]
[illegible]          25            SAT/ACT   [illegible]
[illegible]
[illegible]
[illegible]          [illegible]   SAT/ACT   $32,000 total
Dravo Corporation    [illegible]   SAT       $15,000 total
[illegible]          [illegible]   SAT       [illegible] total
[illegible]          [illegible]   SAT/ACT   $10,000 total



Mr. Edwards. Thank you very much, Ms. Rosser. We are going
to have the rest of the panel testify before we ask some questions.

I might congratulate Ms. Magazine for contributing to The Morning
Edition on All Things Considered. Those of us who ride in automobiles
appreciate what you're doing there.

Our next witness and member of the panel is Dr. Nancy Cole. Dr. 
Cole is Dean of the College of Education, University of Illinois, 
Champaign-Urbana, IL.

Before you begin, we welcome the gentlewoman from Colorado, 
Mrs. Schroeder. Do you have a statement? 

Mrs. Schroeder. No, Mr. Chairman. Thank you very much for 
holding these hearings. 

Mr. Edwards. Thank you. 

Dr. Cole. 

STATEMENT OF NANCY S. COLE 
Dr. Cole. Thank you. 

In my various professional roles I have been a test maker, a test 
critic, and a student of and writer about the technical issues in- 
volved in attempting to judge whether a test is biased or not 
against some special group, often a group defined by race or 
gender. 

Unfortunately, I am not able to appear before you today with 
simple answers and simple solutions to the very complex questions 
of standardized test use and race and gender differences. In fact, 
although my background is technical, and my work has been tech- 
nical, it has led me to view the issue of standardized test use and 
race and gender differences as an issue that reaches far beyond the 
technical. In fact, the issues involved are broad social issues and at 
the very heart of these issues are the questions of how we view per- 
formance and opportunity differences of various sorts in this
society.

When people became especially concerned with race and gender 
implications of standardized testing in the late 1960s, on the heels 
of broad civil rights concerns, the expectation of many was that we 
would find large artifactual effects in tests that produced the group 
differences that were being observed. That is, it was hoped that the 
group differences being observed were the fault of the tests. There 
were then, and there are now, bad tests, and there are bad uses of 
tests. But the stronger finding of a decade of study of tests and the 
possible bias in them has been that the differences are likely no
greater in many tests than the differences all around us— in the 
way children are raised in their homes, in the schools they attend, 
and in the activities in which they engage. 

There are great differences in experiences and opportunity in 
this country by race, socioeconomic status, and gender. Not surpris- 
ingly, these differences result in differences in performance, goals, 
and aspirations, also by race, socioeconomic status, and gender.
The bigger issue by far than the tests themselves is how, as a socie- 
ty, we respond to changing the experience and opportunity differ- 
ences—whether we accept and resign ourselves to performance dif- 
ferences, or act affirmatively to try to create experiences and op- 
portunities to overcome those differences. 




Let me illustrate the complexities of the issue. Standardized tests 
generally show better performance by girls on school-related sub- 
jects in the elementary and middle school years. Standardized 
achievement tests of the school do not start with the assumption 
that the sexes should be performing equally at all grades through 
school, but set their questions based more on the curriculum in the 
schools and what the schools are trying to teach the children. At 
the early grades, the girls outperform the boys on essentially every 
subject. 

By high school, the gap narrows and reversals in some subjects
occur. There has been much discussion of the result that young 
women in high school as a group score more poorly on mathemat- 
ics tests than do young men. 

This result has raised a number of questions: are the tests biased 
against the young women at this stage? Are the schools biased 
against the young women at this stage? Are the parents biased, or 
are the genes biased? These questions are stated in terms of bias 
because many people address them that way. However, there are 
really far more illuminating ways to ask these questions. Some of 
the examples I would like to raise are the following: 

Are the tests asking questions to which young men and young 
women have been equally exposed? We often find they are not. 
Should they be limited to such questions? If young men are taking 
advanced mathematics more frequently than young women in high 
school, should the high school achievement tests be limited to the 
types of questions and courses that the two groups are being equal- 
ly exposed to? Are the questions the test asks important ones on 
which we care about performance? 

If there are group differences, one of our very first questions to 
ask ourselves is, are the tests measuring something we care about?
Because if they're not, then we don't care that there are group
differences. But if the tests are measuring something that looks very
important to us, then we had better worry about the implication of 
those differences. 

Are the schools providing equal encouragement to young women 
and men to take mathematics courses? There are lots of indications 
that they probably are not. Do the teachers and the counselors be- 
lieve in the importance of mathematics for both sexes and act on 
those beliefs? If there are differences in the tendencies of the sexes 
toward mathematics, what is being done to either reinforce those 
differences in the schools or to counter those differences? What
should be done? 

Are parents providing equal encouragement to their children of 
both sexes? Almost certainly not. If not, what should be done to 
overcome those differences? 

Are young women less able to learn mathematics than young 
men, even after all the subtle differences of encouragement and op- 
portunity are eliminated? Even if that were the case, should we try 
to counter that by looking for ways to help young women catch up, 
or resign ourselves to the differences? 

To limit our questions to the tests and their characteristics is far 
too narrow a view of the issue. There are a range of questions we 
should ask about situations in which such differences appear. Only
the first of these is: should the test be changed? Within this question
of changing the test, we must address issues of whether the
performance the test is assessing is important and relevant to the 
use being made of those test scores. This is the usual test validity 
question. In addition, we must address whether the nature of the 
test favors one group over another in ways that are irrelevant to 
the intended purpose, or whether those differences are relevant to 
the intended purpose. In other words, are the performance differ- 
ences real ones that matter? These are the test bias questions. 

Even if we judge that it is not the test that should be changed, 
we must ask: should the use of the test be changed? Here the con- 
cern is whether the use to which the test is being put does more 
harm than good— with the social impact of the test use. Part of this 
concern involves whether or not there are alternatives to this test 
that could accomplish the goal with less negative social impact. 
Sometimes there may be. However, there are instances in which 
the alternatives to the tests could potentially be more harmful 
than the tests, so one should not assume eliminating the test in 
favor of nontest alternatives is automatically an improvement. It 
might be optimistic to assume that judgments without tests for col- 
lege admissions, for example, would automatically right the bal- 
ance between males and females in a better way than the stand- 
ardized tests do. 

Part of the issue concerns whether the goal of the test use is 
itself socially desirable. It requires a careful weighing of social pros 
and cons to reach a reasonable conclusion about the total social 
impact of the test use in college admissions, in testing teachers, 
and in testing students in schools. There is a range of types of 
social impacts that these can have, and in my view it is not a 
simple question to balance the pros and cons. Finally, part of the 
issue is whether test users are putting too much stock in test scores 
or giving meaning to them which is not justified. 

Whether or not we judge that the use of the test should be 
changed, we have the additional question: should the experiences 
leading to the test performance be changed? If the differences are 
important, relevant, and real, what are we as a society going to do 
about them for the individuals directly involved or the generations 
that will follow them? Concern with the tests has too often allowed
us to avoid concern with this more fundamental issue. If young 
women are performing more poorly in mathematics at high school 
and college age, what should we do about it? The real need for 
strong, affirmative, positive action to create change is at the level 
of the experiences leading to test performance differences. For ex- 
ample, what should we do affirmatively in schools to encourage the 
women students to study mathematics, to help them overcome 
mathematics anxiety, to produce real opportunity for mathematics
learning that overcomes the variety of negative experiences with 
mathematics that young women have? 

To point to the areas I view as even more important than the 
tests is not to recommend to you that the tests should be "let off 
the hook." We have much to learn in judging whether the performances
the tests are measuring are, in fact, important and relevant
to the uses made of them. We have not eliminated all possibility of
irrelevant difficulty for some groups in the nature of the questions 
and the ways the tests are given. The issues of validity and bias are 






not resolved and we should continue to press the test producers 
toward high standards and requirements of thorough evidence to 
address these validity and bias issues. However, if our attention is 
focused only here, we may miss the even more important consider- 
ations of whether a particular type of use of even a good test is so- 
cially desirable and how we must change the different experiences 
of persons of different race, socioeconomic status, and gender if the 
goal of equal performance is to be a reasonable one. 
Thank you. 

[The statement of Nancy S. Cole follows:] 






Hearing Date 
April 23, 1987



Testimony of Nancy S. Cole 
to the 

Subcommittee on Civil and Constitutional Rights 
for the hearing on 
Standardized Test Use and Race and Gender Differences 



In my various professional roles I have been a test maker, a test 
critic, and a student of and writer about the technical issues involved in 
attempting to judge whether a test is biased or not against some special 
group, often a group defined by race or gender. However, my expertise has 
not prepared me to provide neat and simple suggestions about race and 
gender differences on standardized tests. My learning in this area has 
made me very humble as it has revealed an issue of tremendous complexity 
and subtlety, not conducive to easy solutions that I can find. The 
complexities are sufficient even within the realm of technical consider- 
ations of bias and group differences. However, the issues are not and 
cannot be viewed as simply technical; in fact, the issues are broad social 
issues. At the very heart of these issues are the questions of how we 
view performance and opportunity differences of various sorts in this 
society. 

When people became especially concerned with race and gender 
implications of standardized testing in the late 1960's on the heels of broad
civil rights concerns, the expectation of many was that we would find
large artifactual effects in tests that produced the group differences
observed. That is, it was hoped that the group differences being observed
were the "fault" of the tests. There were then and are now bad tests, but 
the stronger finding of a decade of study of the tests and possible bias 
in them has been that the differences are likely no greater in many tests 
than they are all around us—in the way children are raised in their 
homes, in the schools they attend, and in the activities in which they
engage. 

There are great differences in experiences and opportunity in this 
country by race, socioeconomic status, and gender. Not surprisingly, 
these differences result in differences in performance, goals, and 
aspirations also by race, socioeconomic status, and gender. The bigger
issue by far than the tests themselves is how as a society we respond to
changing the experience and opportunity differences--whether we accept and 
resign ourselves to performance differences or act affirmatively to try to 
create experiences and opportunities to overcome the differences. 

Let me illustrate the complexities of the issue. Standardized tests 
generally show better performance by girls on many school-related subjects 
in the elementary and middle school years. By high school, the gap 
narrows and reversals in some subjects occur. There has been much 
discussion of the result that young women in high school as a group score 
more poorly on mathematics tests than do young men. This result has 
raised a number of questions: 






Are the tests biased?
Are the schools biased?
Are parents biased?
Are the genes biased?

These questions are stated in terms of bias because many people address
them that way. However, there are more illuminating ways to ask
the questions. Some examples are:

-- Are the tests asking questions to which the young men and women
have been equally exposed? Should they be limited to such
questions? Are the questions the test asks important ones on
which we care about performance?

-- Are the schools providing equal encouragement to young women and
men to take mathematics courses? Do teachers and counselors
believe in the importance of mathematics for both sexes and act on
those beliefs? If there are differences in the tendencies of the
sexes toward mathematics, what is being done to reinforce those
tendencies or counter them? What should be done?

-- Are parents providing equal encouragement to their children of
both sexes? If not, what should be done to overcome those
differences?

-- Are young women less able to learn mathematics than young men even
after all the subtle differences of encouragement and opportunity
are eliminated? If so, should we try to counter that by looking
for ways to help young women catch up, or "resign ourselves" to the
differences?

To limit our questions to the tests and their characteristics is far
too narrow a view of the issue. There are a range of questions we should
ask about situations in which such differences appear. Only the first of
these is, Should the test be changed? Within this question we must
address issues of whether the performance the test is assessing is important
and relevant to the use being made of those test scores. This is the
usual test validity question. In addition, we must address whether the
nature of the test favors one group over another in ways that are
irrelevant to the intended purpose. In other words, are the performance
differences real ones that matter? These are the test bias questions.

Even if we judge that it is not the test that should be changed, we
must ask, Should the use of the test be changed? Here the concern is
whether the use to which the test is being put does more harm than
good--with the social impact of the test use. Part of this concern
involves whether or not there are alternatives to this test that could
accomplish the goal with less negative social impact. There are instances
in which the alternatives to the tests could potentially be more harmful
than the tests, so one should not assume eliminating the test in favor of
nontest alternatives is automatically an improvement. Part of the issue
concerns whether the goal of the test use is itself socially desirable.
It requires a careful weighing of social pros and cons to reach a






reasonable conclusion about the total social impact of test use in college 
admissions, for example. Finally, part of the issue is whether test users 
are putting too much stock in test scores or giving meaning to them which 
is not justified.

Whether or not we judge that the use of the test should be changed, 
we have the question, Should the experiences leading to the test
performance be changed? If the differences are important, relevant, and
real, what are we as a society going to do about them for the individuals
directly involved or the generations that will follow them? The concern
with the tests has too often allowed us to avoid concern with this more 
fundamental issue. If young women are performing more poorly in 
mathematics at high school and college age, what should we do about it? 
The real need for strong, affirmative, positive action to create change is 
at the level of the experiences leading to test performance differences. 
For example, what should we do affirmatively in schools to encourage the 
women students to study mathematics, to help them overcome mathematics 
anxiety, to produce real opportunity for mathematics learning that 
overcomes the variety of negative experiences young women have? 

To point to the areas I view as even more important than the tests is 
not to recommend that the tests should be "let off the hook." We have 
much to learn in judging whether the performances the tests are measuring 
are important and relevant to the use. We have not eliminated all 
possibility of irrelevant difficulty for some groups in the nature of the 
questions and the ways the tests are given. The issues of validity and 
bias are not resolved and we should continue to press the test producers 
toward high standards and requirements of thorough evidence to address 
these validity and bias issues. However, if our attention is focused only 
here, we may miss the even more important considerations of whether a 
particular type of use of even a good test is socially desirable and how 
we must change the different experiences of persons of different race,
socioeconomic status, and gender if the goal of equal performance is to be
a reasonable one.






Bio on Nancy S. Cole 



Nancy S. Cole is Professor of Educational Psychology and Dean of the
College of Education at the University of Illinois at Urbana-Champaign.
Dr. Cole received her B.A. from Rice University and her M.A. and Ph.D. in
psychology from the University of North Carolina, Chapel Hill. She
[words illegible in source] psychologist at the American
College Testing Program in Iowa City, Iowa where she later served as
Director of Test Development and Assistant Vice President for Educational
and Social Research. In 1975, Dr. Cole joined the faculty at the
[institution illegible in source] as Professor of Educational Research
Methodology and later Associate Dean of the School of Education. She
assumed her present position in 1985.

The focal points of Cole's research and publications have been the 
measurement of vocational interests of young men and women, issues of bias 
in testing, and other general problems and issues of standardized 
achievement testing. She is author of a forthcoming chapter, "Bias in
Test Use," in the third edition of Educational Measurement, edited by R.
L. Linn.

Dr. Cole was president of the National Council of Measurement in
Education in 1985 after previously serving on its board of directors. She
has been Vice President for Division D of the American Educational
Research Association (AERA), and Member-at-Large on the AERA Council.
This spring she was named President-Elect of AERA and will assume the
presidency of that organization in the spring of 1988.






Mr. Edwards. Thank you very much, Dr. Cole. Our next witness
is Dr. Diana Pullin, associate dean, college of education, at Michigan
State University. We welcome you, Dr. Pullin.

STATEMENT OF DIANA PULLIN 

Dr. Pullin. Thank you, Congressman Edwards.

Let me begin by indicating that I come before you both as some- 
one who is an academician who has done research on the public 
policy implications of the use of standardized testing, and also as 
someone who has served as plaintiffs' attorney in a number of civil
rights class action lawsuits across the country, challenging the use 
of standardized tests to make critical determinations about individ- 
uals. Let me also say that I share Dr. Cole's concern that some of 
the issues with which we need to be dealing are issues concerning
the nature of the test instruments themselves and the powerful
influence those test instruments have. But, in addition, I
think many of the questions we must address also concern the 
extent to which individuals taking the tests have had full and fair 
and equal educational opportunities to prepare them to compete 
successfully in the battlefields upon which these tests are being 
used. 

I would also like to focus not only on the question of testing in 
higher education and high schools, but also on the very widespread 
use of testing that extends from kindergarten through grade 12
and into higher education. 

As of the last time I took a count, which was late in 1984, 19 
States had initiated tests to determine whether or not to award
regular high school diplomas to students. Eighteen States were 
then, and I believe approximately 34 States are now, relying upon 
standardized competency tests to make determinations about ini- 
tial teacher certification, entry into or exit from teacher education 
programs, and to determine whether veteran experienced success- 
ful teachers can retain their teaching certificates and their employ- 
ment in the Nation's classrooms. 

In addition, several Southern States and a number of local school 
districts across the country are using what might be termed "ready
or not" testing to determine the eligibility of young children for
entry into either kindergarten or the first grade. Promotional gates 
testing is being used in at least five States and numerous local 
school districts to determine whether students can be promoted 
from grade to grade. Achievement testing is used in almost every 
school district to make tracking or ability grouping determinations 
for class placement for students. 

The SAT and ACT are being used to determine entry into higher
education, and with increasing frequency in our latest mode of so- 
called educational reform, tests are often being used as the sole cri- 
terion for determining entry into the growing number of programs 
for gifted and talented students in this country. 

Finally, State and Federal laws designed to reform the delivery
of special education services, in particular the Education for the
Handicapped Act, which is designed to serve students with handicapping
conditions, have resulted in the increased use of tests to make






diagnostic and placement decisions about students who are considered
potential candidates for special education services.

These new testing mandates are layered on top of, and have
[words illegible in source] of the very large amount of
standardized testing that was already going on in our schools for
the purposes of measuring State or local progress in educational
achievement, for gathering nationwide data on education programs
(through the National Assessment of Educational Progress), and for
conducting various independent educational research, assessment,
and program evaluation efforts. And all of
these standardized testing activities are applied on top of the very
considerable amount of classroom testing done with teacher-made
tests and the many ready-made tests that come from book
publishers in conjunction with many textbook series, particularly
reading text series.

While I regard most classroom testing as relatively benign, the
use of standardized tests for the purpose of monitoring
student and teacher accountability and for making critically important
decisions about individual students and teachers is provoking
growing controversy. This controversy has focused in particular
on the impact of testing requirements and the uses of
standardized tests for minority students.

As has been the case for many years, the performance of minority
students on many, if not most, of these measures is often dramatically
lower than that of their white counterparts. There is a concern
growing among many of us that the new testing programs may
serve not simply as educational measurement devices but may in
addition play a major role in redefining the nature and content of
education and influencing the educational opportunities to
which students are exposed.

While there is a good deal of information concerning these various
issues, let me simply bring to your attention and highlight
some of the uses of standardized tests across the country.

For example, if we begin with the area of special education programs,
for the most part both the legal and educational systems operate
on the presumption that programs for students with handicapping
conditions are made available to those who meet particular
physical or medical criteria and who are therefore eligible for
special education services. However, it is now quite clear that
most of the determinations hinge heavily, if not exclusively, upon
the use of standardized test instruments. This is at least in some
part the explanation for the following kinds of demographic phenomena
that are occurring.

In California, for example, in 1979, in a situation challenged in a
Federal lawsuit, Larry P. versus Riles, black children represented
approximately 10 percent of the total student population in the
State of California. On the other hand, they accounted for approximately
25 percent of the enrollment in classes for students labeled
as educable mentally retarded, a situation the Federal courts eventually
found to violate Federal statutory principles.

If one were to look across the country, a recent analysis of data
from the U.S. Department of
Education indicates the following overall data concerning placements
and average rates of placement into classes for the educable
mentally retarded:

The overall rate of placement for students into those classes is
about 1.50. The placement of white students is at a rate of approximately
0.87. The placement rate for Hispanics is 1.31, and the
placement rate for blacks is 2.44. A similar analysis on a district-by-district
basis indicates that in many large city districts, and in
many southern districts, those rates are much higher in terms of
the disproportions for black and Hispanic students.
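
One way to read the placement rates quoted above is to divide each group's rate by the white rate. The following is a minimal sketch of that arithmetic in Python. The function name `relative_rates` and the choice of the white rate as the reference point are illustrative assumptions and are not part of the testimony; the three rates are the figures the witness gives.

```python
# Placement rates into classes for the educable mentally retarded, as quoted
# in the testimony (units as reported by the witness).
emr_rates = {"white": 0.87, "Hispanic": 1.31, "black": 2.44}


def relative_rates(rates, reference="white"):
    """Return each group's rate divided by the reference group's rate.

    The reference group is an assumption made for illustration only.
    """
    base = rates[reference]
    return {group: rate / base for group, rate in rates.items()}


if __name__ == "__main__":
    for group, ratio in relative_rates(emr_rates).items():
        print(f"{group}: {ratio:.2f} times the white placement rate")
```

On these figures the black placement rate is roughly 2.8 times the white rate and the Hispanic rate roughly 1.5 times the white rate.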

On the other side of the coin, however, to look at programs for
the so-called gifted and talented students, the overall placement
rate is about 4.70. However, the white placement rate is [illegible in source], the
black placement rate is 2.61, and the Hispanic placement rate is
2.57.

If one were to look with particular focus upon Georgia, a State in
which it is my understanding that standardized test scores are the
sole determinant of placement into classes for the gifted and talented,
one finds, for example, that in the City of Atlanta's public
schools, whites outrepresent blacks in gifted and talented place- 
ments by a rate of approximately 7 to 1. In Charlotte-Mecklenburg, 
NC, whites outrepresent blacks at a rate of approximately 11 to 1. 
On both sides of the coin, either at the special education end of the 
provision of services, or at the gifted and talented end of the provi- 
sion of services, one finds dramatic overrepresentations of whites 
among the gifted and talented population and underrepresentation
of blacks in that same population, with the reverse being the situation
in special education classes for the educable mentally retarded.

The issue plays itself out again when one looks at the current
wave of efforts to use teacher competency tests to determine con- 
tinuation of certification or initial certification for individuals seek- 
ing to prepare for the important profession of teaching. The legal 
rule used under Title VII of the Civil Rights Act, of course, is that 
one cannot use that statute in the Federal court system to address 
race differences in testing unless there is a statistically significant 
difference between black and white rates of performance on the 
test. 

A frequent rule used to determine disproportion is that one 
should look for a two-standard deviation difference in such per- 
formance. Some data I dealt with just last week indicated that in 
the teacher competency test being used by the State of Georgia to 
determine initial certification and continuing certification, there is 
a 119.7 standard deviation rate difference for blacks taking the ex- 
amination. 
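
The testimony does not say how the standard deviation figure was computed. One common way to express a pass-rate gap in standard deviation units, sketched here under a simple binomial assumption, is to compare the number of test takers in a group who actually passed with the number expected if the group had passed at the overall rate. The function names, the two-standard-deviation threshold parameter, and every figure in the example are hypothetical and are not drawn from the Georgia data.

```python
import math


def standard_deviations_from_expected(group_takers, group_passers, overall_pass_rate):
    """Gap between observed and expected passers, in binomial SD units.

    Assumes each test taker passes independently at the overall pass rate,
    which is a simplifying assumption made for illustration.
    """
    expected = group_takers * overall_pass_rate
    sd = math.sqrt(group_takers * overall_pass_rate * (1.0 - overall_pass_rate))
    return (group_passers - expected) / sd


def exceeds_threshold(z, threshold=2.0):
    """True when the gap is larger in magnitude than the chosen threshold."""
    return abs(z) > threshold


if __name__ == "__main__":
    # Hypothetical figures: 400 test takers in the group, 180 of whom passed,
    # against an overall pass rate of 75 percent.
    z = standard_deviations_from_expected(400, 180, 0.75)
    print(f"gap in standard deviations: {z:.2f}")
    print("exceeds two standard deviations:", exceeds_threshold(z))
```

With these made-up numbers the expected count is 300 passers and the standard deviation is about 8.66, so the shortfall of 120 passers is far beyond two standard deviations.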

There are, in short, a number of considerably troublesome ques- 
tions that can be presented when one deals with this issue, and I 
do not envy the subcommittee having to grapple with the complex- 
ity of the issues presented here. I think, in making your determina- 
tion about how to proceed in your deliberations, I would ask only 
that you consider very carefully not only the question of looking at
the tests themselves and the extent to which we can encourage
very rigorous standards for validity and reliability of the tests, the 
extent to which we can attempt to minimize the use of tests as the 
sole criterion for making decisions, and for engaging in decision 






making about significant issues in the lives of the student or teach- 
er, but also to ask you to carefully consider the civil rights and 
educational implications of the use of these tests. 

Tests are becoming more and more pervasive in the kindergarten 
through grade 12 culture of our schools. They are coming to be 
very influential in the nature of the relations between teachers 
and students, and they will have a growing, rather than a decreasing,
level of importance in the Nation's schools and in attempts to
ensure the enforcement of the civil rights of women and minorities 
who must work in those schools. 

Thank you.

[The statement of Diana Pullin, with attachments, follows:]






EDUCATIONAL TESTING: IMPACT ON CHILDREN AT RISK
by Diana Pullin 

Increased Use of Standardized Tests

A series of recent reports has accused the nation's
public schools of promoting mediocrity and generated an
increased interest in the use of tests to measure educational
progress. Concern about the quality of public education
provoked an increase in test use beginning in the mid-1970's.
Since 1975, the use of tests to make critical educational
decisions about students and to implement various public policy
goals has increased dramatically. As of the late summer of
1984, nineteen states have initiated tests to determine the
award of regular high school diplomas. Eighteen states are
relying on competency tests to make determinations about
teacher certification or entry into or exit from teacher
training programs. Several southern states and a number of
local school districts are using "ready or not" testing to
determine the eligibility of young children for entry into
kindergarten or the first grade. Promotional gates testing is
being used in at least five states and numerous local school
districts to determine grade-to-grade promotion. Achievement
testing is used in almost every school district to make
tracking or ability grouping determinations for class







placement. Finally, state and federal laws designed to reform
the delivery of special education services to students with
handicapping conditions have increased the use of tests to make
diagnostic and placement decisions about students (Pipho &
Hadley, 1984).

These new testing mandates of the past decade add to the
large amount of standardized testing already going on in our 
schools for the purposes of measuring state or local progress 
in educational achievement, gathering nationwide data on 
educational progress (through NAEP, the National Assessment of 
Educational Progress), determning access to higher education 
(through ,.:,e SAT or ACT), and conducting various independent 
educational research, assessment, and program evaluation 
efforts. All of these standardized testing activities are 
applied on top of the very considerable amount of classroom 
testing done with teacher-made tests and the many ready-made 
tests that come from book publishers in conjunction with many 
textboo)? series. 

Classroom teachers have long relied upon the use of tests
to make assessments of individual and group progress and to
gather information about the extent of individual or group
educational deficiencies. Little public controversy has
resulted from classroom testing for several reasons. Since
teachers have available to them a wide variety of information
about their students in addition to classroom tests, much of it
based upon direct personal observation, the concerns about test
use are minimized due to the presumption that teachers act not
on the basis of test scores alone but instead use a wide array
of information available to them about each student. In
addition, most feel that the decisions of an individual teacher
about a student do not have significant implications for the
life chances of that student given the many other educators
with whom the student will come into contact over the years.

Testing as a Barrier to Educational Opportunity 

While classroom testing may be regarded as relatively
benign, the recent increase in the use of standardized tests
for the purpose of monitoring student and teacher accountability
has provoked considerable controversy. This controversy has
focused upon the impact of the new testing requirements on
minority students, whose performance is often dramatically
lower than that of their white counterparts, and a concern that
the new testing programs may serve not simply as educational
measurement devices but may, in addition, play a major role in
redefining the nature and content of education itself (Madaus,
1980, 1985).

Educators have long known that low income, minority, and
limited-English-proficient youth consistently demonstrate lower
levels of proficiency on most standardized tests. These lower
scores are in large part the result of the limited educational
opportunities traditionally afforded the nation's low income,
minority, and limited-English-proficient students. In many
situations in the past, low performance on a standardized test,
particularly an achievement test, would not work to
disadvantage a student and could often be used to direct the
student to more beneficial educational opportunities
particularly targeted to the student's needs (Madaus, et al.,
1980).



However, our educational history also includes
substantial evidence that the otherwise relatively benign use
of achievement test data to guide educational programming or
planning for either individual students or groups of students
can be fraught with very negative unintended consequences
(Oakes, 1985). For example, achievement test scores used to
determine class placement have frequently resulted in the
racial isolation of minority students in particular groups or
tracks within school buildings. Often, those so-called ability
grouping mechanisms have been adopted by schools undergoing the
early years of school desegregation; here, test use has often
been halted by courts on grounds that the tests were being used
as a mechanism for circumventing integration (Committee on
Ability Testing, 1980). While the notion of targeting
instruction to students' particular educational needs is the
practice advocated by almost all educators and the concept of
tracking or ability grouping homogeneous students to foster
such an approach sounds appealing, tracking and grouping
practices have not succeeded in promoting this goal. Much
available evidence indicates that, rather than helping to
foster educational attainment so that students acquire more
skills and knowledge and, therefore, move up and out of lower
tracks or groups and into higher ones, the vast majority of
lower track or group placements become dead ends. Students
rarely move up into higher level placement, in part because the
diluted curriculum and instruction provided in the lower levels
leave the students enrolled there further and further behind
their age peers, rather than enhancing their attainment so that
they might catch up. The result of test use to determine class
placement, therefore, often serves as a roadblock to access to
future educational opportunities (Oakes, 1985; Labaree, 1983).

Misclassification of Minority Students

Enrollments in certain types of special education
programs demonstrate a similar phenomenon. While enrollments
in programs for students with handicapping conditions that are
identified according to physical or medical criteria are
statistically representative of the racial proportions in the
population as a whole, enrollments in programs serving students
with educationally-defined conditions frequently contain
disproportionate enrollments of minority youth. Classes for
students identified as having a moderate level of mental
retardation are often populated with a disproportionate number
of minority and low income youth. In California in 1979, for
example, black children represented only about 10 percent of
the total student population, but accounted for 25 percent of
the enrollment in classes for students labeled as educable
mentally retarded. Federal courts assessing the situation
determined that these disproportionate enrollments resulted
from over-reliance upon standardized intelligence tests to make
decisions about special education placements (Committee on
Ability Testing, 1983).

Given these types of trends in test performance and the
use of test data, it is not surprising that new uses of tests
have produced results which also include significant
discrepancies between the performance of white middle class
students and the rest of the school population. While this
data provokes the same types of haunting questions about what
happens to minority and low income children in our schools, it
also evidences a problem of even more serious magnitude. The
new tests are being used to make decisions that are critical in
determining the life chances of the children who take the
tests. Indeed, in some instances, test scores are the only
evidence considered in making significant decisions about
students. When a written examination is the sole basis for
determining the award of a regular high school diploma and when
substantial numbers of minority students fail that test, the
result is a bar to entry to the job market, to military
service, and to higher education for a significant proportion
of minority youth (Pullin, 1984). Further, the prospects of
the loss of a diploma even after twelve years of the requisite
attendance and attainment of passing grades has apparently been
daunting enough to provoke an increase in the rate of school
dropouts (Madaus, 1985).






While a detailed national record of results on the use of
high school graduation tests is not yet available, the results
from several of the states present a vivid picture of the
implications of the testing requirements for minority youth.
In Florida, the first state using a test as the criterion for
determining the award of a high school diploma, the initial
scores on the test indicated that black students failed the
test at a rate ten times greater than the failure rate for
whites; statistical analyses indicated that these
disproportions were more highly correlated with race than
student socioeconomic status, a factor often used by educators
to explain school performance. The first time the requirement
was actually imposed after court orders had placed a four year
moratorium on the use of the tests due to its unlawful effects,
57 percent of the students who failed to meet the test
requirement were black in a student population that was only 20
percent black (Madaus, 1983).

Testing and the Handicapped 

Another population that may be particularly disadvantaged
by the use of tests as high school graduation requirements is
students with handicapping conditions. Many times in the past,
students with handicapping conditions were able to attain high
school diplomas, enter into the world of work, and attain a
degree of economic self-sufficiency. This same category of
students now is barred from diplomas, often because the nature
of handicapping conditions is such that success on a paper and
pencil test is impossible (Pullin, 1984). Particularly when
passage of a graduation test is supposed to represent so-called
"real world competency," the result may be cruel, for many
students who fail the tests are able to function competently in
the real world. Evidence such as this raises real questions
about what it is that the tests are measuring and the accuracy
with which the measurements occur. The lay public, as well as
many educators, place the same faith in standardized, paper and
pencil tests that they place in the thermometers with which
they read their body temperature. A thermometer reading of
99.1 degrees may not necessarily mean that you have a fever;
thermometers are not perfectly accurate and the usual body
temperature of each of us varies such that the "normal"
temperature of 98.6 is simply an approximation. Education
tests work in a way quite similar to thermometers. Our nation
has recently become quite enamored with the use of tests as a
means of measuring the success of the educational process.
Like thermometers, tests are not always accurate and some are
much less accurate than others. One large difference between
thermometers and tests, however, is that most of us would agree
that knowing your body temperature when you are not feeling
well may be a useful piece of information. The same cannot
necessarily be said for educational tests. Thinking that
something is wrong with our schools, or with particular
children in our schools, because of poor test performance
neither tells us with accuracy that something is wrong, what is
wrong, or why something is wrong. When Johnny can't read, his
parents and teachers ordinarily know he can't read and don't
need a test score to confirm this fact. A more troublesome
fact, and one which tests rarely help, is that too often we
don't help Johnny learn to read or to read better. Tests have
little to do with these problems, particularly the new wave of
tests now popular with policy makers; these tests provide no
diagnostic information about why Johnny can't read and are
therefore of no help in correcting the situation.

Limitations of Testing 

The importance of understanding what it is that tests can
and cannot tell us is critical. Not all tests are accurate
measures of the skills and knowledge they purport to measure
and even the more accurate tests are at best approximations.
In addition, the public is often led to make generalizations
about individuals that cannot be supported on the basis of test
information. Teacher competency tests provide a good example of
the types of misconceptions that tests can generate. While
those working closely with a testing program may be well aware
of the content covered by the test and the permissible
inferences that can be drawn from test scores, the public use
of test information may be far broader and less defensible.
For example, the public might quite understandably feel that a
teacher or a candidate for a teaching certificate who fails a
teacher competency test is an incompetent teacher. However,
despite what the names of these tests imply, none of the
so-called teacher competency tests in any way measure the
on-the-job performance of teachers and none of the tests tell
us whether teachers can in fact teach and teach well. This may
explain why in one state two of the teachers who failed a
teacher competency test had recently been named teacher of the
year in their respective school districts.

Although the new tests are of little help in improving
the education of individuals, they may exert a powerful
influence upon what happens in schools. Research on the
effects of student competency tests both here and in Europe
indicates that the content covered on the tests tends over time
to control the skills and knowledge covered in the curriculum.
This result has led several commentators to charge that the
long-term effect of the use of minimum competency tests will be
the further dilution of the content of instruction such that
the minimums covered on the test become the maximum limits for
instruction (Madaus & McDonagh, 1979). This charge is
particularly disconcerting given the growing body of evidence
that, while the proficiency of students in basic skills areas
has been increasing over a period of years (predating even the
introduction of most of the student competency tests), student
competence in the higher order skills of complex and critical
thinking and problem-solving has been declining (Madaus, 1981).
Given the power of externally imposed tests to influence the
content of instruction and the current focus of the tests upon
minimum basic skills, the tests may exacerbate the growing
problem of declining achievement in higher order skills.

The Politics of Testing 

Two more general problems inherent in the new testing
movement provide clear warning that the tests will probably not
promote the types of educational reform and accountability test
proponents are seeking. First, it is well to remember that,
for the most part, the new testing mandates are the result of
political reforms, not educational reforms (Wise, 1978; Madaus,
1985). The new programs have most often been imposed by
legislators and state and local school board members. The
programs are not the result of efforts to apply state-of-the-art
insights from educational research into practice. Indeed,
research in this area suggests that the new programs may well
have more deleterious consequences, such as diluting curriculum
content, than the wide-ranging benefits test proponents have
predicted.

Second, we have only begun to understand the implications
and impact of these programs. Test proponents tout dramatic
increases in test scores, particularly for minority students.
However, with insufficient data on what it is that the tests
are measuring, we cannot know if students now, in fact, know
more or if, instead, they have enhanced their test-taking
skills, reduced their test anxiety, or been given tests written
at a less difficult level. Further, [since most states fail]
to maintain adequate or accurate data on student dropouts, we
have no way of knowing if pass rates have gone up in part
because the proportion of discouraged students taking the test
has gone down (Madaus, 1981).

Tests are appealing. They appear to afford the public an
easy, even scientific way of measuring the progress of our
educational system. However, while the public has come to
believe that improved test scores represent educational
progress, test scores are only surrogate measures of real
learning, the acquisition of important, useful skills and
knowledge. To that extent, the new tests may appear to
demonstrate that our students possess "the right stuff" while,
instead, all that we have achieved is an illusion of
educational progress, a portrait sketched at the expense of
many youngsters who are disadvantaged by these testing schemes.






Cawelti, G. (1978, May). National Competency Testing: A Bogus
Solution. Phi Delta Kappan, 619-621.

Committee on Ability Testing. (1982). Ability Testing: Uses,
Consequences, and Controversies. (Part 1). Washington,
DC: National Research Council.

Down and Out in the Classroom: Surviving Minimum Competency.
(1979, January). Principal, 58, 12-59.

Gallagher, J. & Ramsbotham, A. (1978, October). Developing
North Carolina's Competency Testing Program. School Law
Bulletin, IX, 8-14.

Gould, S. (1981). The Mismeasure of Man. NY: W.W. Norton &
Company.

Haney, W. & Madaus, G. (1978, November). Making Sense of the
Competency Testing Movement. Harvard Educational Review,
48, 462-84.

Houts, P. (1977). The Myth of Measurability. New York: Hart
Publishing Company, Inc.

Hyman, R. (1984, March/April). Testing for Teacher Competence:
The Logic, The Law, and The Implications. Journal of
Teacher Education, XXXV, 14-18.

Labaree, D. (1983). Setting the Standard: The Characteristics
and Consequences of Alternative Student Promotional
Policies. Philadelphia: Promotion Standards Committee of
Citizens Committee on Public Education.

Levin, H. (1978). Educational Performance Standards: Image
or Substance? Journal of Educational Measurement, 15,
309-319.

Madaus, G. (1981, October). NIE Clarification Hearing: The
Negative Team's Case. Phi Delta Kappan, 63, 92-94.

Madaus, G. (1983). The Courts, Validity, and Minimum
Competency Testing. Boston: Kluwer-Nijhoff Publishing.

Madaus, G. (1985, May). Test Scores as Administrative
Mechanisms in Educational Policy. Phi Delta Kappan,
611-617.

Madaus, G. & Airasian, P. (1977). Issues in Evaluating
Student Outcomes in Competency-Based Graduation Programs.
Journal of Research & Development in Education, 10, 79-91.






Madaus, G. & McDonagh, J. (1979, June). Minimal Competency
Testing: Unexamined Assumptions and Unexplored Negative
Outcomes. Paper presented at the annual conference
[remainder illegible].

McClung, M. (1979). Competency Testing Programs: Legal and
Educational Issues. Fordham Law Review [volume and pages
illegible].

[Author and date illegible]. [Title partly illegible]
Bait-and-Switch. Phi Delta Kappan, 621-625.

Oakes, J. (1985). Keeping Track. New Haven, Connecticut: Yale
University Press.

[Author and date illegible]. Race, Socioeconomic Status,
[illegible] Competency Testing. (Research Paper Series).
National Social Science and Law Project, Washington, DC.

[Author and date illegible]. Competency Testing: A Social and
Historical Perspective. Educational Horizons, 3-8.

Pipho, C. & Hadley, C. (1984, July). State Activity: Minimal
Competency Testing. Clearinghouse Notes.

[Author and date illegible]. Competency Testing: A Review of
the Case Law. School Law Update, chapter 3, 161-174.

Pullin, D., Sedlak, M. & Wheeler, C. (1985). Proposals for
[remainder illegible].

[Author and date illegible]. The Impact of Competency Tests on
Teacher [illegible] and Legal Issues in Selecting and
Certifying Teachers. In H. Kaberman's Research in Teacher
[remainder illegible].

Spady, W. (1977, January). Competency Based Education: A
Bandwagon in Search of a Definition. Educational
Researcher, 6, 9-14.

Wise, A. (1978). Minimum Competency Testing: Another
Case of Hyper-Rationalization. Phi Delta Kappan, 596-608.






KCA5 

617-357-6507 



Vol. = : 



ABOUT THE AUTHOR 

Diana Pullin holds both a Ph.D. in Education and a law
degree from the University of Iowa. For the past ten years,
she has been actively involved in testing issues as an
education researcher, an educator, and a litigation attorney.
She has represented parents and students, school districts, and
teacher associations in disputes involving school testing,
educational accountability and educational equity for minority
and special needs students. She is most known for her
representation of the students and parents who brought the
landmark federal court challenge to Florida's minimum
competency testing program.

At the present time, Dr. Pullin is Associate Dean of the
College of Education at Michigan State University.



CONTACTS ON BACKGROUNDER #2

Dr. Pamela George 
1025 Lakewood Avenue 
Durham, NC 27707 
919-489-0296 

Dr. George Madaus
Director, Center for the
Study of Testing, Evaluation
and Educational Policy
Boston College
Chestnut Hill, MA 02167
617-552-4521

Dr. Vito Perrone
Center for Teaching and Learning
University of North Dakota
Grand Forks, ND 58202
701-777-2674




[Year illegible] Average ACT Composite Scores*

18.6    Overall
19.[?]  White
15.9    Puerto Rican, Cuban, Other Hispanic
19.1    Asian American, Pacific Islanders
1[?].6  Mexican American, Chicano
13.9    American Indian, Alaskan Native
12.5    Afro-American, Black

*from ACT Issuegram [number illegible] (January 1986)



Impact of the ACT in Mississippi**

ACT Scores for Three Historically Black Universities in Mississippi
Mean of Entering Freshmen, 1985-86

Alcorn State                          13.[?]7
Jackson State                         14.[??]
Mississippi Valley State              12.53

ACT Scores for Other Mississippi Universities

Delta State                           19
Mississippi State                     21.19
Mississippi University for Women      20.08
University of Mississippi             20.83
University of Southern Mississippi    19.51

**from ACT Report to Board of Trustees of State Institutions of Higher
Learning in Mississippi






The Condition of Education
1986 Edition

Statistical Report
Center for Education Statistics

Edited by
Joyce D. Stern and Mary Frase Williams

U.S. Department of Education
William J. Bennett, Secretary

Office of Educational Research and Improvement
Chester E. Finn, Jr., Assistant Secretary

Center for Education Statistics
Emerson J. Elliott, Director






A. Outcomes: Transitions

High school completion by race and ethnicity

In examining the outcomes of our schools, one
important measure is whether students are able to
[illegible] finish high school, then it is doubtful
that they have [illegible] in society.

Thus, one outcome measure of education is the extent
to which students complete high school with
classmates about the same age. The data in the
[illegible] percentages of students who have
successfully completed 12th grade or the equivalent
at ages [18-]19, and ages 20-24.

The public generally expects 18- to 19-year-olds to
have a high school diploma. And, indeed, most
[illegible]. However, as can be seen from the table,
many students [take] a longer period of time to
complete their high school education. For example,
the percentage of 20- to 24-year-olds having obtained
a high school diploma or its equivalent is about 10
percentage points greater than that for 18- to
19-year-olds.

[These rates] were computed from tabulations from
the Bureau of the Census Current Population Survey.
These data are collected from household interviews
and include information on individuals who have
completed 12 or more years of schooling or who
have obtained an alternative credential such as a
General Educational Development (GED) certificate.

Table 1:8

[Table body and notes not legible in source]

SOURCE: Bureau of the Census, October Current Population Survey
[remainder illegible]





CHART 1:8 - High school completion rates by race and Hispanic origin

SOURCE: Bureau of the Census, Current Population Reports, Series P
[remainder illegible]

• Nationally, slightly less than three quarters of all 18- and 19-year-olds
have completed high school.

• The proportion of 20- to 24-year-olds who have completed high school has
held steady at about 84 percent since 1974.

• The high school completion rate among blacks, for both 18- to 19- and
20- to 24-year-olds, has increased in the last decade. The rates for both
blacks and Hispanics still lag far behind those of whites.






[illegible] Characteristics

Participation rates for higher education
by race/ethnicity

Americans have prided themselves on having one of
the most democratic systems of education in the
world. The goal of equal access for all qualified
youth has long been held as an objective of our
educational system. A measure of the national
progress toward this goal is participation rates in
higher education of various populations. This
indicator looks at participation rates of blacks and
Hispanics aged 18-24 since 1970.

Black participation rates improved dramatically from
197[?] to 197[?]. Hispanics also increased their
participation [illegible] remained somewhat lower
than the white rate. [White] participation rates have
increased since 1979. [Black] participation rates
declined in the late 1970s and have been relatively
stable since then. [Year-to-year] differences for
Hispanics since 1975, however, are not statistically
significant.

Caution should be used in interpreting the data
presented here. Racial/ethnic definitions the Bureau
of the Census uses are not mutually exclusive.
Therefore, direct comparisons between Hispanics and
whites or blacks are not possible. Whites and blacks
are defined as racial groups, whereas Hispanics are
defined as an ethnic group and can be of any race.



Table 2:9

Participation rates of 18- to 24-year-olds in higher
education by race/ethnicity: 1970 to 1985

[Table values not legible in source]






CHART 2:9

Higher education enrollment rates of 18- to 24-year-olds by
race/ethnicity

[Chart: percent enrolled, by year]

Participation rates for minorities increased during the early 1970s.

The proportion of blacks 18 to 24 years old attending postsecondary
institutions [illegible] 197[?], declined after 1975, and has been
relatively stable since 1978.




Mr. Edwards. Our thanks to all the members of the panel. It 
was very splendid testimony. 

The gentlewoman from Colorado, do you have any questions?

Mrs. Schroeder. Thank you, Mr. Chairman. I thank all the witnesses
also for their testimony.

I guess the question I have is— what I think I hear you saying— 
is that while young women score lower on the tests, and the tests
are supposedly to be predictors of performance in college, that 
when they get to college they do better. Therefore, the test is really 
not valid and we're not just whining about the fact that we haven't 
had the same background or the same math classes. It really is not 
predicting what women do once they get to college. 

Is that correct? Is that the bottom line? 

Ms. Rosser. Yes, it is correct. It is not predicting. That is the
reason that these college entrance exams are given, that they are 
supposed to predict future performance. They are not doing that 
for girls. They are also not valid in respect to their past perform- 
ance in high school, because girls are getting better grades 

Mrs. Schroeder. As a mother of a 16-year-old daughter, and listening
to all of them now taking their tests and coming home, it
really is very interesting because I see that going on. You see how
perplexed some of them get if they're not scoring the way they 
thought they should be scoring. They are beginning to wonder if 
their high school performance wasn't valid, or if they had charmed 
their high school teachers, if maybe suddenly they're not as good as 
they used to be. They really start having incredible self-doubts 
about that. 

But I guess my frustration is, if the test isn't adequately predict- 
ing what women do when they get to college, and everybody can 
see that, why in the world don't they change the test? That was the 
whole purpose for the test and I don't understand why universities 
haven't changed it, or why they are relying on the tests so much, if
that is true.

You know and I know that high school counselors tell these kids,
"Hey, you don't score here, you don't apply; you don't score here,
you don't apply." I mean, it is really the key to the college door
and everything is tied to that score. Forget what they did in high 
school as far as grades, or whether they were taking college level 
courses; none of that matters. They really hang so much of it on 
the books you buy at the bookstore, or in the way that the college 
counselor directs you on that score. So a lot of young girls are be- 
ginning to think that maybe they were a fraud, you know, that 
something has happened. 

So why haven't they changed it? One of you has been suing, and 
others have been preaching. Why do the colleges insist on continu- 
ing to use them if they don't predict, and why won't the test people 
change it? 

Ms. Rosser. Well, I don't know why the test people won't change
it exactly. I've had a lot of discussions about that with them. They
say that they feel the girls just aren't taking enough math and science,
and if they would just take more, they would do better. But,
in fact, girls have been taking more math and science and doing
worse. The gap has been widening.






The colleges use these, it is said, because that way you do get a 
larger male student body, if you use these test scores, because the 
boys' scores are higher. So if you want to keep your student body 
more male than female, a very good way to do that is to use these 
tests. 

But I am very upset about the fact that girls work very hard in 
high school and college and are not being rewarded by having the 
same opportunities to go to prestigious schools, to go to research 
universities, and that their whole life, their whole work up to that 
point, can be just downgraded by one number. 

Mrs. Schroeder. My understanding is that the difference in the
math gap between boys and girls is really very small, that the 
number of boys who take four years of math versus the number of 
girls who take four years of math, and then take the test, is fairly 
small. In addition, the test supposedly only goes to geometry, which 
is three years of math in most high schools. Therefore, the math 
gap really shouldn't make that much difference anyway. 

Ms. Rosser. Actually, the College Board data says that 50.5 percent
of girls, versus 57 percent of boys, take four years of math.

Mrs. Schroeder. So it is very small.

Ms. Rosser. Yes, it is very small. Also, it is true that geometry is
supposed to be the most advanced math you need to take for this 
test. All college-bound girls that I know of certainly take geometry 
in their sophomore and junior year. So I really don't understand 
why this math gap is happening, either. 

Dr. Cole. Could I respond to those comments, also? 

Mrs. Schroeder. Sure.

Dr. Cole. I think we have misrepresented the situation if we 
leave with the impression that the college admissions tests don't 
predict college performance for women but do for men. That is just 
not the case. In fact, the data shows that women are the group that 
college admissions tests predict best for of any group, in terms of 
the relative relationship between the test scores and the college 
grades. 

The prediction phenomenon that Ms. Rosser is referring to is the 
question of the statistical relationship of the level of performance 
to the test score that's based on complex statistical regression procedures.
The sorts of differences referred to are very small. In fact,
it is not clear from many perspectives whether they mean the in- 
terpretation she's giving to them at all. But to be left with the 
notion that the tests don't predict for women and do for men is just 
not a correct notion. 

Mrs. Schroeder. So you would say the test doesn't predict for 
either one? 

Dr. Cole. No, I would say the test predicts some for both. It cer- 
tainly doesn't tell the whole story, but it's 

Mrs. Schroeder. Well, if it doesn't tell the whole story, then why 
do they rely so heavily on it? I sit there listening to young kids and 
watching all this counseling going on. Let me tell you, the score is 
90 percent of what everybody focuses on. If it's not a good predictor, 
then why don't they change the tests across the board to be a 
better predictor for men and women? 

Dr. Cole. You see, the tests are almost as good a predictor as the 
high school record. If you are dealing with a situation of very competitive 
admissions, with large numbers of applicants, the best information 
you can find to decide who are the students most likely 
to [illegible] the high school record of the student and, 
second, the standardized test scores. Both are used. 
[illegible] the situation in which an 
[illegible] the high school record. 

My institution puts very heavy weight on the high school record as 
well as weight on the test scores. It worries me very much that we 
don't spend time reading what these kids write and judging letters 
of recommendation about the students. But the volume and numbers 
involved often preclude that. 

I think, in fact, the publicity often oversells the role of the test in 
college admissions because it is something easy to focus on and 
[illegible] to some extent, in the 
[illegible] how important the test scores are in that actual decision. 

Mrs. Schroeder. I really don't know. My job, before I did this, 
was I used to analyze tests for jobs for the State of Colorado. I 
remember going through the ones for pilots and finding out how it 
was very culturally biased. They would have timed tests. I remember 
one in particular where they would show you pictures of a 
[illegible] what was wrong. If you had a cracked window and you 
came from a neighborhood where cracked windows were normal, 
[illegible] We could find things like 
[illegible] to low income and to females, 
that had nothing to do with whether or not you could fly the airplane. 
If you were testing for whether or not you could fly the airplane, 
then I would have no problem. 

[illegible] Foreign Service exams when I 
first got to Congress. Then I was really able to get into them. 
[illegible] Foreign Service exams that the U.S. Government 
[illegible] against every group in America except white 
males from four universities. Beyond that, who cares who won the 
[illegible] and what does that have to do with 
whether or not you know how to administer a foreign aid program? 
Nevertheless, that's what we tested for. It had no applicability. 

[illegible] employment area, which is the old 
[illegible] you know, you can't test a janitor for 
[illegible] because, as a janitor, he doesn't 
have to know about classical music. You can test him on wax. 

But I have always been very disturbed that I haven't seen that 
kind of sensitivity among colleges, and I think it is much tougher 
[illegible] becoming much more elitist now, with the tremendous 
[illegible] a lot of parents are saying to young 
people, "If you can get into a top school, terrific; if you can't, we're 
not going to pay the difference." 

I don't think this is de minimis. I think this is much more critical 
because it is channeling which way kids go now, just because of 
[illegible] T-shirts for Christmas saying 
"This kid is his mom's Mercedes." It's true. But for those kids you 
[illegible] of families don't have 
[illegible] school [illegible] going to pay the money. 



As I listened to the counselors, it is not much different than 
when I went to school. They are hanging the whole thing on the 
score. It may be wrong, the schools may not mean it, but that's 
what they are doing, and I really salute those of you who are 
trying to get this changed. If it discriminates against men, too, 
then it's wrong. But young people's whole lives are being changed 
by these test scores — black, white, Native American, male, 
female — and I think we ought to do everything we can to make 
them as accurate a predictor as possible, as we have done in the 
employment area. I think education is behind the curve on that. 

Thank you, Mr. Chairman. 

Mr. Edwards. Ms. LeRoy. 

Ms. LeRoy. Dr. Cole, to follow up on the discussion that you and 
Mrs. Schroeder were having, I agree that you can't look at test re- 
sults in isolation and just focus all of one's energies on these tests. 
You suggested that perhaps they are being missold or that the pub- 
licity surrounding them creates the perception of greater emphasis 
on these tests than is really being placed on them or ought to be 
placed on them. But the fact is that that perception is there, and I 
think it's more than a perception. 

For example, the Secretary of Education has these famous wall 
charts that we've all seen — and I can't unfold it; it's too big — which 
basically evaluate the States on their educational performance 
based on SAT scores. So that entire school districts perceive them- 
selves, and obviously, individual teachers, perceive themselves 
based on their students' performance on these tests. And I assume 
teachers are evaluated on them and individual schools are evaluated 
on them. 

What can be done to get around that kind of problem when the 
chief educator in the country is, in fact, contributing to the prob- 
lem? 

Dr. Cole. When you ask that question, we move from the issue of 
what's wrong with the test to what's wrong with the use of the 
test. Every issue that Dr. Pullin raised, for example, is a question 
of social policy with respect to the use of the test. The Secretary's 
wall chart is an issue of the social policy implications of the use of 
the test, and the wall chart is an abysmal use of the SAT. It's 
almost a ridiculous use of the SAT scores. 

Most of the concerns that Dr. Pullin raised are concerns about 
the overuse of tests, the use of tests that I think can have inappro- 
priate social consequences that I'm very much concerned about. 
That is still a different question than the question of how we 
should change the SAT, if we should. It's a question of whether we 
should use it in certain ways, even if we had it perfect, the way we 
wanted it. It still would not work for the Secretary's wall chart. 

Ms. LeRoy. Well, I don't want to look at those issues in isolation 
because I think they go together to create a problem for education 
in this country, with respect particularly to women and minorities, 
but also education generally. 

I would ask you what could be done to reduce bias in the test, 
assuming that it's there, but I also want to ask you what can be 
done to deflate some of this overemphasis on or misuse of these 
types of tests. I realize that may be a Ph.D. thesis here. 



Dr. Cole. Well, that's a very hard question for me to answer. In 
fact, in answering it, I think I would come back to trying to understand 
why we came to such a wide use of tests. It is my view that 
[illegible] we were unwilling to impose standards and 
make difficult professional judgments in other ways, in better 
[illegible] system. We came to use standardized tests 
for admissions to college more and more because we were more and 
[illegible] set in high school and the 
[illegible] courses the students are in. 

We use tests for graduation from high school because we're suspicious 
that the educational system hasn't set standards for itself 
internally to use the better information. We use tests for promotion 
for the same reason. We use tests for identifying kids for special 
education classes, at either end of the scale, because we 
[illegible] professional judgment of the educational system. 

I think, fundamentally we are not going to be able to thwart 
this overuse of tests until we make some serious changes in the 
quality of the educational system and the standards that we have 
internal to the system, where we can use better information than 
just external standardized tests to support some of these decisions. 

But that is not an easy solution. You see, one of my dilemmas in 
answering your question is what would be happening to college admissions 
without the test? What would be happening in putting 
[illegible] programs without the tests? I am concerned that it 
[illegible] worse. I am concerned that, without the tests, 
[illegible] that ought to be in 
[illegible] programs, we could have even more of them being excluded. 

Mrs. Schroeder. Would counsel yield on that point? 

Ms. LeRoy. Yes. 

Mrs. Schroeder. What I don't understand, though -- I don't think 
we're arguing to do without the tests. The question is, there is a lot 
[illegible] the tests and we understand that we would like to 
[illegible] public and private education 
[illegible] rely on that more. But why not make 
sure the test is as fair as it can be, then? 

Dr. Cole. Well, we certainly should do that. But you have given 
[illegible] there are other counter-examples 
[illegible] our dilemma. 

Mrs. Schroeder. Right. [illegible] the worst and try and 
make it more gender-neutral and minority-neutral. 
Dr. Cole. Absolutely. 

Mrs. Schroeder. I think we're really in agreement. 

Ms. LeRoy. Let me ask the other two witnesses the same general 
question, and that is, what can public policymakers, people in Congress, 
[illegible] the Department of Education, what should they 
be doing [illegible] can anything be done to assure both test equity and 
validity and proper use of the tests? 

Ms. Rosser. I feel these tests should be predictive. They should 
predict what they're supposed to. Unless we feel that grades are 
completely erroneous, which I don't think anybody does, I think 
that the tests should correlate with the grades that students get, 
both for women and for minorities, however that is done. I feel that 
is what Congress should be looking into and requiring of test pub- 
lishers. 
Ms. LeRoy. Dr. Pullin. 

Dr. Pullin. I think, as Dr. Cole suggested, this is a complicated 
issue. I think the testing industry could do far more than has been 
done to alleviate some of the problems of unfairness. 

I would ask Congresswoman Schroeder to think about what she 
means when she talks about her goal of fairness and to be more 
explicit about it, because there is a good deal on those tests that is, 
in fact, representative of a culture and reflective of a culture, the 
culture of schools, and it is predominantly a white male culture. To 
make it look more white-maleish is not what I think any of the 
three of us are talking about. 

When I think about the kinds of things that this body is able to 
do to address the kinds of issues we're talking about, they are, for 
the most part, the kinds of legalistic approaches that have been 
used to some extent successfully in the past. As the Congresswom- 
an noted, we have made some fairly substantial gains in the em- 
ployment testing arena. There is some discussion that those gains 
will be lost because the EEOC Guidelines are under discussion for 
considerable dilution in terms of validity and reliability require- 
ments. I think that would be a terrible disservice to the kinds of 
populations we're concerned about here, to allow those to be diluted. 

Similar kinds of more rigorous standards need to be employed in 
the educational testing arena. To some extent they have been. For 
example, if you look at the regulations under the Education of the 
Handicapped Act, although these are not widely enforced, there 
are very specific regulations talking about lack of bias and talking 
about use of multiple criteria information to make determinations 
about individuals. Those kinds of standards are available. They are 
not available in every arena, among the educational arenas that 
we're talking about, and they are not being enforced. 

Ms. LeRoy. Thank you. 

Mr. Edwards. Does minority counsel have any questions? 
Mr. Slobodin. Thank you, Mr. Chairman. 

Good morning. I wanted to first ask Miss Rosser and Dr. Pullin, 
do you currently have, or have you ever had, any affiliations with 
the Fair Test organization? 

Ms. Rosser. I have, for the last three months, been a consultant 
to Fair Test. That consultancy actually ends today. 

Mr. Slobodin. OK. 

Dr. Pullin? 

Dr. Pullin. I know the people at Fair Test and I have talked to 
them about these issues in the past. 

Mr. Slobodin. Let me talk to Miss Rosser for a moment about 
LSAT's. Reading from a passage in a book by Cynthia Fuchs Epstein, 
called "Women in the Law", she writes here -- she is making 
the point that "the problem raised by preference for women is 
unlike the problem of other minority group preferences because 
women applicants have generally been better qualified than men." 
Then she proceeds to support that proposition, that "the average 
law school admission test score did not vary significantly by sex." 

[illegible] counselor reported in a study of 
LSAT scores for 1973-74 that the mean test score for both men and 
women was 527. Of registrants for the LSAT in 1973, 75.2 percent 
were men and [illegible] percent were women. In 1974-75, LSAT surveys 
revealed that women had a slight edge over men in law school admission 
test scores. The mean test score for male registrants was 522, 
while the same score for women was 524. 

[illegible] LSAT's by the standards 
set for law students." A 1972 study commissioned by the law school 
[illegible] determined bias of sex and the tests showed 
that 1,1[illegible] males used as a comparison group, with 1,165 females, 
scored approximately 10 points lower and had a mean writing ability 
score approximately 7 points lower than women. Women did 
better on four of the six sections in the LSAT: reading comprehension, 
[illegible] recognition and sentence correction. Men 
did better on one section, data interpretation. The two groups 
scored about equally on the principles and cases section. 

What is the story here? 

Ms. Rosser. Well, I am not an authority on the LSAT, but I have 
talked to people who have done research on this. They say that the 
women who take this test are probably about 10 times better ver- 
bally than the men and, in fact, they should be doing even better 
on the test than men than they are doing. 

Mr. Slobodin. Yes, but in your testimony you mention as an example 
[illegible] reading comprehension questions 

Ms. Rosser. I was talking about the SAT there. 

Mr. Slobodin [continuing]. SAT's. And the women score higher 
on reading comprehension. That's where you're pointing out where 
[illegible] is, but that's where women are scoring higher. 

My question to you is, let's talk in terms where there has been a 
disparity, and that's in the math area. 

First of all, what is a bias? I mean, if there's a one point difference 
between the sexes, would you consider that bias? How about 
five points? Of what threshold are we talking about bias? 

Ms. Rosser. I think when it has a major impact on people's lives, 
that's negative, that is bias to me. I think it's having a major 
impact on women's lives. That, to me, is bias. 

Mr. Slobodin. But what is the impact? Where are women not 
getting into-- Are you saying they're not getting into Harvard? 
Let's name some schools here. 

Ms. Rosser. All right. Fewer women get into Harvard than men. 
Fewer women are getting into all the Ivy League schools than men. 
Fewer women get into the other prestigious schools than men. 
There is actually national data on that. 

Mr. Slobodin. How do you explain this rise, then, in women 
coming into the law schools? In fact, you say at the beginning of 
your testimony here, "What struck me first when I looked at these 
tests was the overwhelming number of males that populated 
[illegible] traditional occupations like 
doctor and lawyer * * *" What has happened in the last 10 years 
has been phenomenal growth. 

[illegible] phenomenal change in the legal profession? We ought to be 
taking a look at that and saying, "Well, it's working in the legal 
profession. We have women now taking an interest and becoming lawyers. 
The tests aren't stopping that growth." We ought to be looking at ways 
of extrapolating that for math and science. How does it follow that we need 
to change the test? 

Ms. Rosser. Well, yes, there has been a tremendous growth in 
the legal profession. It's about time we had more women getting 
into the legal profession. 

One of the things about the LSAT was that there was a lot of 
math on it. It didn't relate at all to being a lawyer. And because of 
a certain amount of testing reform pressure, some of that math 
was taken off. 

Mr. Slobodin. When was that? 

Ms. Rosser. Oh, within the last five years, I believe. 

Mr. Slobodin. Yes, but this was before. I'm citing statistics 
before they took the math out. 

Ms. Rosser. But some of those statistics also show that women 
did less well, and also that they 

Mr. Slobodin. Yes, but they still scored as well, if not better, 
than men on those tests. 

Ms. Rosser. But they did less well on the data. 

Mr. Slobodin. So what? They did better than men in all the 
other sections. When you combine them, they actually had a slight 
edge. 

Let me go on to an article which publicized the report you re- 
leased last week in the New York Times. The reporter writes here 
in the article that your study "offers no analysis of standardized 
tests and gives no examples of biased questions. The findings are 
based on the conclusion that, because girls earn better grades than 
boys in high school and college, they should do as well or better on 
the tests." 

Now, I would like to discuss this premise. Have you studied 
whether or not there is any bias in grading in high school courses? 
Why should we consider that more reliable than a question that 
asks "what's the circumference of this cup"? 

Ms. Rosser. I think that girls, historically, over the years have 
been getting better grades in high school and college, and they 
have been doing less and less well on these tests. I think that is 
bias. I don't think that I have to come up with specific questions 
that are biased. I think this is something that the people who know 
about tests will come up with. ETS knows which questions they 
are. I think we should really look at the effect this is having on 
people's lives, and that is a bias effect. 

Mr. Slobodin. You don't see the potential for bias in— What 
about Dr. Cole's point, that we could have the potential for a lot 
more bias without the use of these tests? 

Ms. Rosser. Well, I think that society is biased against women. 
There's no question about that. And I think they are doing quite a 
good job of overcoming this handicap in the classroom. 

Mr. Slobodin. Let's talk about the Educational Testing Service. 
We are going to get some testimony from Dr. Dwyer, where she 
says about 80 out of 125 people that are involved in developing 
these standardized tests are women. Why would these people that 
[illegible] hurt their [illegible] 

[illegible] people who are picking those [illegible] 

[illegible] you? [illegible] 

Ms. Rosser. I'm speculating, and so are you. 

Mr. Slobodin. It's not based on evidence; it's speculation. 

Now, have any of your studies controlled for level of preparation, 
for instance, comparing girls who have taken the same math 
courses, the same years of math, as boys? 

Ms. Rosser. Yes, the College Board does that. They publish voluminous 
data on that, and they control for that. 

Mr. Slobodin. [illegible] I have that study. It showed 
that the male-female gap in SAT mathematical performance 
shrinks considerably when differences in quantitative high school 
course work are taken into consideration. That point may not be 
[illegible] 

[illegible] in that area. Females are still doing worse. 

Mr. Slobodin. Well, it says here, when they control for the same 
level of preparation, that gap is cut considerably. 

I see my time is up. 

Mr. Edwards. We have other witnesses. But we appreciate very 
much your contribution. So thank you very much for being here today. 

[illegible] Gretchen Rigol, Executive Director of 
Access Services, College Board, New York, NY; and Dr. Carol 
Dwyer, [illegible] director, Test Development, School and Higher 
Education Programs, the Educational Testing Service in Princeton, NJ. 

[illegible] welcome you. Do you solemnly 
swear that the testimony you are about to give is the truth, 
the whole truth, and nothing but the truth? 

Ms. Rigol. Yes. 

Dr. Dwyer. Yes, I do. 

Mr. Edwards. Thank you. 

Miss Rigol, I believe you are first. 

[illegible] EXECUTIVE DIRECTOR, 
[illegible] COLLEGE BOARD; AND CAROL ANNE 
[illegible] EDUCATION PROGRAMS, EDUCATIONAL 
TESTING SERVICE 

Ms. Rigol. Thank you, Mr. Chairman. 

[illegible] I am executive director for Access 
Services of the College Board, a position I have held for 6 years. 
My division is responsible for directing the Admissions Testing Program, 
which includes the Scholastic Aptitude Test. Prior to joining 
the College Board, I was Director of Admissions at Pratt Institute, 
and also served as an admissions officer at Goucher College and 
Mount Holyoke College. 

Founded in 1900, the College Board is a national, nonprofit asso- 
ciation of more than 2,500 colleges and universities, secondary 
schools, school systems and educational associations. A description 
of the full range of our services and programs is attached to my 
testimony. 

One of the original purposes of the College Board was to provide 
a series of common entrance examinations that would be available 
to students from all parts of the country, not just those few who 
attended well-known preparatory schools. Those first "College 
Boards" represented a major step toward making higher education 
accessible to all students— a goal that is still of paramount impor- 
tance to the College Board and its member institutions. 

Today, the College Board's most widely used test is the SAT. A 3- 
hour, multiple choice test, the SAT measures developed verbal and 
mathematical reasoning abilities necessary to successfully pursue 
college-level work. It provides a common yardstick to help admis- 
sions officers understand an applicant's academic readiness for col- 
lege-level work as they review transcripts from students who have 
taken different courses in the more than 25,000 secondary schools 
throughout this country. 

Mr. Chairman, I welcome this opportunity you have provided to 
address the complex and complicated issues of fairness in testing 
and differences in scores among groups of test takers. 

Average scores of various groups taking the SAT have been pub- 
lished for many years. Twenty years ago, the average SAT scores 
for women were slightly higher than the average scores for men on 
the verbal section of the SAT. This difference ranged from 2 to 7 
points. But even then, women's average math scores were consider- 
ably lower than men's scores, between 41 to 47 points lower. 

The first time women's average verbal scores fell below the average 
scores of men was in 1972. The differential in that year was 
two points, and for the next several years the difference fluctuated 
between three and six points. Then, in 1978, the difference in- 
creased to eight points, and in 1981, it became 12 points. Although 
there have been slight fluctuations during the past 6 years, the dif- 
ferences have remained between 10 and 13 points. I think it is im- 
portant to remember that the total 61 point score differential that 
is so often mentioned includes 50 points on the math that has been 
evident for at least two decades. 

I should emphasize that these scores are group averages and, as 
such, they do not reveal the different abilities of individuals within 
these groups. Distributions of scores reveal that the individuals 
within all groups display the full range of developed abilities, from 
highest to lowest. 

Average score differences are of great interest, but I would like 
to state now that, based on the best available data, we do not believe 
these differences are caused by bias in the tests themselves. 
In many ways, this hearing and the ongoing investigations about 
differences in score performances are similar to the work undertak- 
en in the seventies to help educators understand the overall score 
decline that began during the late sixties. Just as the Advisory 
Panel on the SAT Score Decline rejected the notion of any one 
single cause for the overall decline in SAT scores, I suspect that 
there are probably numerous factors involved in this inquiry. 

What are some of the possible reasons for the score differences? 
One is that the number of women taking the SAT increased significantly 
in the early seventies, just as the number of men decreased. 
The growth in the numbers and proportions of women SAT takers 
from only 44 percent in 1964 to 50 percent in 1975, and 52 percent 
since 1981, corresponds exactly to the time periods in which the 
scores of women declined. This past year, there were about 40,000 
more women than men who took the SAT. When dealing with av- 
erage scores on tests that are taken by self-selected populations, 
rather than balanced samples of students at all ability levels, it is 
usual for higher proportions of test takers to result in lower test 
scores. 

We believe that the increase in the number of women taking the 
SAT, presumably because more women are considering a college 
education, should be regarded positively. 

Another reason for the score differences between men and 
women is also related to shifts in population characteristics. The 
larger number of female test takers have, in recent years, included 
more women from racial and ethnic minorities. For example, of the 
nearly 80,000 black students who took the SAT in 1985, 60 percent 
were women. Women represented 55 percent of the American Indians 
who took the test, and the percentages of women in the Puerto 
Rican and Mexican-American groups were 54 and 53 percent, respectively. 
It is well recognized that the educational opportunities 
available to many of these minority students are not the same as 
those available to white students. 

Parents of the females taking the SAT had slightly less formal 
education than the parents of the male students taking the tests, 
and female students tended to come from families with lower 
median incomes. We also know that, as a group, the women taking 
the SAT were less likely to have followed an academic or college 
preparatory program in high school and that, on average, they took 
fewer years of study in academic subjects. 

The courses women take in high school, as we have been discussing, 
are also a factor in explaining some of the score differentials. 
For example, the more math students take in high school, indeed, 
the better they do on the SAT math section. The fact that women 
take fewer math courses than men probably explains a large part 
of the 50 point difference in SAT math scores. Women also take 
fewer courses in the physical sciences. 

I am sure that you all share my concern that many young 
women are not encouraged to take more math and science courses 
and that so few consider scientifically oriented career paths. It is 
difficult to know exactly how much of the score difference in math 
is related to these unfortunate social influences, but I personally 
am convinced that there is no inherent difference between men 
and women which precludes women from excelling in the areas of 
mathematics and sciences. 

Although the difference in average verbal scores is not as great, 
[illegible] more difficult to suggest explanations 
for this 11-point difference. Some is undoubtedly related to 
the population differences described earlier. We have examined the 
test specifications and many of the items on previous editions of 
the test and have found no systematic explanations for the differ- 
ence. There are questions on which women do less well than men, 
but there are also questions on which women do better. Usually 
these items are neutral in content and do not suggest any plausible 
reasons for the differential performance. There have also been numerous 
changes in the test content over the years; however, none 
of those changes coincide with the times when there were signifi- 
cant shifts in scores. 

During the past few weeks, there have been allegations that the 
test is constructed to intentionally produce scores at different 
levels for various subgroups. I would like to state categorically that 
this is absolutely untrue. The College Board is committed to admin- 
istering fair, effective and equitable tests. Our members would 
accept no less. 

We use a variety of methods to detect or evaluate for the possibility 
of any potential bias in our tests. Among them are numerous 
reviews and statistical analyses that are described in my statement 
and that will be discussed in a moment by my colleague from ETS. 
I should note, however, that all of this research is not done only at 
ETS and that the College Board makes the data available to out- 
side researchers for their own analysis. 

Another method for determining if a test is fair examines wheth- 
er it predicts equally well for different groups of students. The Col- 
lege Board offers a validity study service to help colleges perform 
studies of the predictive validity of test scores and other informa- 
tion used in the admission and placement of students. In over 500 
colleges where females and males were studied separately, the 
median correlation of the SAT with college freshmen grade point 
average was higher for women than for men. This data is included 
in the ATP Guide that was attached to my testimony. In other 
words, the SAT has proved to be a more accurate predictor for 
women than for men. 

Much has been said recently about the so-called "under-prediction" 
by the SAT of women's college grades. The data on which 
this statement is based comes from a research report published by 
the College Board. The data show that in the particular studies 
analyzed in this report, women's actual college grades were four 
one-hundredths of a grade point higher than their predicted grades 
using a combination of high school academic record and the SAT, 
not just the SAT alone. 

More significant, however, is the fact that the prediction equations 
used in that study were based on the sexes combined. If a prediction 
equation based on women alone had been used, under- and 
overprediction is eliminated. Our Guidelines on the Uses of College 
Board Test Scores and Related Data specifically encourage colleges 
to consider separate predictions of college grades based on sex, 
race, academic program, and so forth. 
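
The distinction drawn above, between a prediction equation fitted to the sexes combined and one fitted to women alone, can be illustrated with a minimal sketch. The scores and grades below are invented for illustration and are not College Board data; the helper names `fit_line` and `mean_residual` are hypothetical. The sketch only shows the general statistical point that a pooled least-squares line can under-predict one group on average, while a line fitted to that group alone has an average prediction error of zero by construction.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def mean_residual(xs, ys, a, b):
    """Average of (actual - predicted); a positive value means under-prediction."""
    return mean(y - (a + b * x) for x, y in zip(xs, ys))


# Hypothetical (test score, college GPA) pairs, invented for this sketch.
women = [(400, 2.6), (500, 3.0), (600, 3.4)]
men = [(450, 2.5), (550, 2.9), (650, 3.3)]

women_x, women_y = zip(*women)
men_x, men_y = zip(*men)

# Equation based on the sexes combined.
pooled_a, pooled_b = fit_line(women_x + men_x, women_y + men_y)
# Equation based on women alone.
women_a, women_b = fit_line(women_x, women_y)

pooled_gap = mean_residual(women_x, women_y, pooled_a, pooled_b)
own_gap = mean_residual(women_x, women_y, women_a, women_b)

print("women, pooled equation, mean (actual - predicted):", pooled_gap)
print("women, own equation, mean (actual - predicted):", own_gap)
```

In this invented data the pooled equation under-predicts the women's grades on average, and the women-only equation does not. The size of any such gap in real admissions data depends entirely on the actual scores and grades.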

There have been recent suggestions that women are being unfairly 
denied admission to higher education because of their SAT 
scores. The evidence is just the contrary. The increase in the 
number of women taking the SAT over the last 20 years has been 
mirrored in their college-going rate. More women seek entrance to 
and attend college than men. For example, total enrollments in 
higher education in 1983, the latest year for which statistics are 
available, were 52 percent female and 48 percent male. This is 
[illegible] the SAT in that year. 

[illegible] 77 different colleges that accepted fewer than 50 percent
of their applicants. Mr. Chairman, you may be interested to know
that Stanford was one of those colleges, where the acceptance rate
was 13 percent for men and [illegible]. The acceptance rate for
women at these 77 colleges [illegible]. Clearly, women are finding
that the doors to even the most selective colleges and universities
are open to them.

In conclusion, I [illegible] commitment to administering fair and
equitable tests. The review process, statistical [illegible] changing
demographics and the diversity of test takers, questions of bias
and fairness will become even more significant and more of a
challenge.

It is a tragic fact of American life that educational opportunity is
still not equal for all students. The educational deficit experienced
by many minority and disadvantaged students will neither disappear
nor be overcome simply by attributing different levels of performance
on tests to bias. Although the educational opportunities available
to women are comparable to those available to men in similar
school settings, the fact that women are not always encouraged to
take full advantage of these opportunities cannot be overlooked.

Thank you, Mr. Chairman. 

[The statement of Gretchen W. Rigol, with attachments, follows:]







The College Board

1717 Massachusetts Avenue, NW, Washington, DC 20036
(202) 332-7134



Testimony
on the Use of
Standardized Tests, and Race and
Gender Performance Differences on such Tests



before the 

Subcommittee on Civil and Constitutional Rights
Committee on the Judiciary
U.S. House of Representatives



Gretchen W. Rigol 
Executive Director 
Access Services 
The College Board 



April 23, 1987






SUMMARY



[illegible] scores on [illegible] tests themselves. The primary
purpose of the SAT is to [illegible] developed verbal and
mathematical abilities [illegible].

[illegible] than others [illegible] test takers from [illegible]
percent [illegible] predictor for women than men.

[illegible] postsecondary education [illegible]. Total enrollments
in higher education [illegible] 48 percent male, the same proportion
[illegible] took the SAT [illegible] year. At 77 of the nation's
most selective colleges, the acceptance rate was 34 percent for
women and 33 percent for men.

[illegible] is essential that [illegible].






Mr. Chairman, my name is Gretchen Wyckoff Rigol, and I am Executive
Director for Access Services at the College Board, a position I have held
for six years. My division is responsible for administering and directing
the Admissions Testing Program, which includes the Scholastic Aptitude
Test (SAT), and the Achievement Tests, as well as the Preliminary
Scholastic Aptitude Test/National Merit Scholarship Qualifying Test
(PSAT/NMSQT) and other related services. In my current capacity, I work
with representatives from College Board member institutions and other
users of our services to review policies and procedures relating to these
programs. Prior to joining the College Board I was Director of Admissions
at Pratt Institute and also served as an admissions officer at Mount
Holyoke and Goucher Colleges.

Founded in 1900, the College Board is a national, non-profit
association of more than 2,500 colleges and universities, secondary
schools, school systems, education associations, and agencies. One of the
purposes of the College Board is to assist students who are making the
transition from high school to college through guidance and admissions
programs and to provide them and the institutions to which they are
applying with placement, credit by examination, and financial aid
services. A description of the full range of our programs is attached to
this testimony.




One of the original purposes of the College Board was to provide a
series of common entrance examinations that would be available to students
from all parts of the country, not just those who attended a few
well-known preparatory schools. Those first "College Boards" represented
a major step toward making higher education accessible to all students, a
goal that continues to be of paramount importance to the College Board and
its member institutions.

College admissions has changed in many ways since the beginning of
this century, and College Board tests have played a role in opening
college doors for large numbers of students. The Admissions Testing
Program continues to enable colleges and universities to identify talented
students from vastly diverse backgrounds and recruit those with academic
potential. (Another example of the College Board's commitment to
promoting access to higher education is the College Scholarship Service.
In the 1950's the College Board membership responded to the need for a
more equitable distribution of financial aid by pioneering procedures for
awarding such aid according to financial need, a move that also increased
educational opportunities for the less affluent and raised the level of
participation in postsecondary education of minority students.)

Today, the College Board's most widely used test is the Scholastic
Aptitude Test (SAT), which is taken by more than one and one-half million
college-bound students every year. A three-hour, multiple-choice test,
the SAT measures developed verbal and mathematical reasoning abilities
necessary to pursue college-level work successfully. It provides a common
yardstick to help admissions officers understand an applicant's academic
readiness for college-level work as they review transcripts from students
who have taken different courses in the more than 25,000 secondary schools
throughout this country.

As a former admissions officer, I can assure you that it is not always
easy to interpret what the actual content of a course might have been, let
alone what the grading practices are in a particular school. We all know
that an "A" from a certain teacher in one course might be quite different
from an "A" in a different course or from a different teacher. In
addition, some schools provide additional weight to certain honors or
advanced level courses, while others do not. And at many colleges nearly
all of the applicants have very high grade-point averages, making it even
more difficult to differentiate among applicants.

Although grade inflation appears to have slowed down during the past
few years, the average high school grade-point average for the Class of
1985 was still slightly higher than a B average (3.03 on a 4.0 scale).
Therefore, results from the SAT or other national standardized tests,
given under similar conditions to all students, provide admissions
officers with an objective context from which to view other information
they have about their applicants.

It is important to remember, however, that the SAT is only one of the
factors considered by colleges in making admissions decisions. Despite the
limitations of information about a student's secondary school background
(such as grades, courses taken and class rank), the high school record is
still given more weight than any other criterion by most colleges and
universities in making admissions decisions.

But SAT or any other test scores have limitations too. For one
thing, they are not precise measures. The current score reports sent to
students, as well as to the colleges they designate, show how scores
should be viewed as ranges around the numerical scores that are also
reported. There are also many other qualities that colleges may value and
that might be important to successful performance in college that the SAT
does not measure. For example, the SAT does not reflect special talents
or leadership qualities, nor can it predict the academic motivation or
self-discipline a student may bring to the collegiate environment. It cannot
predict every type of performance nor measure every kind of background
that may be of interest to a college. But SAT scores do provide one more
piece of useful information to help both a college and a student assess
how well that student might do at that particular institution,
particularly when considered in the context of other relevant information
about the test taker and the institutional environment.

Representatives of College Board member institutions who serve on
various advisory councils have developed a series of Guidelines on the
Uses of College Board Test Scores and Related Data, which enumerate the
proper uses of tests and highlight practices deemed inappropriate. These
Guidelines, which are included in the ATP Guide for High Schools and
Colleges, are widely distributed to schools and colleges. Test scores,
according to the Guidelines, should be used as "supplemental to the
secondary school record and other information about applicants in
assessing their ability to undertake college-level studies, recognizing
that a combination of predictions is almost always better than a single
prediction." To further encourage the proper use of test scores, the
College Board sponsors professional training for school counselors and
college admissions officers and disseminates a variety of publications and
audio-visual materials.

Mr. Chairman, I welcome the opportunity you have provided to address
the complex and complicated issue of fairness in testing and differences
in scores among groups of test takers. As you have requested, my
testimony today will focus primarily on score differences between men and
women on the SAT and, secondarily, on racial and ethnic differences.

SAT scores are reported separately on a scale of 200 to 800 for both
the verbal and mathematical sections of the test. To help put the
discussion that follows in context, I should mention that the overall
average SAT-verbal score for the Class of 1986 was 431 and the overall
average SAT-math score was 475.

Average scores of various groups taking the SAT have been published
for many years. Twenty years ago, the average SAT scores for women were
slightly higher than average scores for men on the verbal sections of the
SAT (ranging from 2 to 7 points), but even then, women's average
mathematical scores were considerably lower than men's average scores
(ranging from 41 to 47 points in the late 1960's). Although these
differences were noted and were well-known to educators, I do not believe
any definitive reasons were discovered to explain why, during that time,
women performed slightly better than men on the verbal section and scored
considerably lower on the mathematical section of the SAT.

Although the gap between men's and women's math scores has remained
about 40 or 50 points for the past two decades, there has been a gradual
change in the relative performance of men and women on the verbal section
of the test during this period. The first time women's average verbal
scores fell below the average verbal scores of men was in 1972. The
differential in that year was 2 points and for the next several years the
difference hovered between 3 and 6 points. Then in 1978, the difference
became 8 points and in 1981 it became 12 points. Although there have
been slight fluctuations during the past six years, the differences have
remained between 10 and 13 points. The optimist (and perhaps the
feminist) in me would like to suggest that the past three years that have
seen the score differential move from 13 to 12 and last year to 11 points
is perhaps a trend that will reverse these differences, but perhaps
that's merely wishful thinking. Nonetheless, I think it is important to
remember that the total 61 point score differential includes 50 points on
the math section that has been evident for at least two decades.

The most comprehensive reports about SAT takers, including
information by sex and by racial/ethnic group, are a series called
Profiles, College-Bound Seniors. The most recent report in this series
describes the high-school graduates of 1985, and the data provided below
are taken from that publication. The numbers in parentheses indicate the
difference between the average scores for men and women of each
racial/ethnic group.



Table 1. 1985 SAT Scores by Sex and Ethnic Group

                       ------ SAT Verbal ------   --- SAT Mathematical ---
                       Males   Females            Males   Females

American Indian         401      384     (17)      454      406     (48)
Black                   354      341     (13)      394      364     (30)
Mexican American        393      373     (20)      452      402     (50)
Asian American          406      401     ( 5)      540      496     (44)
Puerto Rican            385      363     (22)      435      381     (54)
White                   454      444     (10)      515      468     (47)
Other                   398      384     (14)      478      419     (59)

Total Respondents       437      425     (12)      499      452     (47)




When the average scores of males and females from the different
racial/ethnic groups are reviewed, it is clear that male/female
differences are not constant across all groups. On the verbal sections,
Asian American men and women show the smallest differences, while
the largest differences are evident between men and women with Hispanic
backgrounds. When looking at average SAT-math scores, Black women have
the smallest difference when compared with Black men, with larger score
differences apparent for all other groups. These data illustrate the
complexity of the issue.
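
The comparisons in the preceding paragraph can be checked directly against the Table 1 averages. The following is a minimal sketch in Python, not part of the original statement: the dictionaries hold the male and female averages as read from Table 1, and the function names `gaps` and `ranked` are illustrative only.

```python
# Average 1985 SAT scores from Table 1, stored as (male, female) per group.
VERBAL = {
    "American Indian": (401, 384),
    "Black": (354, 341),
    "Mexican American": (393, 373),
    "Asian American": (406, 401),
    "Puerto Rican": (385, 363),
    "White": (454, 444),
    "Other": (398, 384),
}
MATH = {
    "American Indian": (454, 406),
    "Black": (394, 364),
    "Mexican American": (452, 402),
    "Asian American": (540, 496),
    "Puerto Rican": (435, 381),
    "White": (515, 468),
    "Other": (478, 419),
}


def gaps(scores):
    """Return {group: male average minus female average}."""
    return {group: male - female for group, (male, female) in scores.items()}


def ranked(scores):
    """Return (group, gap) pairs sorted from the smallest gap to the largest."""
    return sorted(gaps(scores).items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    for label, table in (("verbal", VERBAL), ("math", MATH)):
        print(label, ranked(table))
```

Ranking the gaps this way shows the smallest verbal gap for Asian American test takers, the two largest verbal gaps for the Puerto Rican and Mexican American groups, and the smallest math gap for Black test takers.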

I should emphasize that these scores are group averages, and as such
they do not reveal the different abilities of individuals within those
groups. Distributions of scores reveal that the individuals within groups
(whether that group be based on sex or racial/ethnic group) display the
full range of developed abilities, from highest to lowest.

The SAT is not the only test that shows score differences among the
different groups taking the test, particularly differences between male
and female scores. The same trend exists for American College Testing
(ACT) Program scores. The ACT includes separate scores in four areas:
English Usage, Mathematics Usage, Social Studies Reading, and Natural
Science Reading and is scored on a scale from 1 to 36. Between 1970 and
1984, the advantage of women on the English score declined from an average
of 1.8 ACT score points to 1.1. Similarly, the advantage of men on the
other three ACT tests and on the ACT composite grew over the same period
of time. For example, from 1970 to 1984 on the Social Studies Reading
subscores, the advantage of males climbed from 1.3 to 1.6 and on the
Natural Science Reading from 1.6 to 2.5. Data from the National
Assessment of Educational Progress (NAEP) and other standardized tests
show similar trends, suggesting a deterioration of women's average scores
in relation to average scores of men.

Average score differences are of great interest, but I would like to
state now that, based on the best available data, we do not believe these
differences are caused by bias in the tests themselves. We are pleased
that this Subcommittee has provided an open forum to discuss and examine
the issues. We invite the members of the Subcommittee to ponder with us
and other educators the dilemma of trying to explain differential
performance and changes over time. We would be less than honest if we
claimed to have all the answers. We can offer some hypotheses, but we
continue to question research and our conclusions. In many ways, this
hearing and the ongoing investigations about differences in score
performances are similar to the work undertaken in the mid-1970s to help
educators understand the overall score decline that began during the late
1960s. Just as, in 1977, the Advisory Panel on the SAT Score Decline,
headed by former Secretary of Labor Willard Wirtz, rejected the notion of
any one single cause for the decline in SAT scores, I suspect that there
are probably numerous factors involved in this inquiry.

Why, then, do some groups of students score lower than others? In
addressing this issue, we must look beyond the test scores to the
educational experiences and backgrounds of the test takers. It is a
tragic fact of American education that educational opportunity is still
not equal for all students. The educational deficit experienced by many
minority and disadvantaged students will neither disappear nor be overcome
simply by attributing different levels of performance on tests to bias.
Although the educational opportunities available to women are comparable
to those available to men in similar school settings, the fact that women
are not always encouraged to take full advantage of these opportunities
cannot be overlooked. Tests help reveal differences and it is essential
that we work together to try to eliminate the cause of these differences,
rather than blame the messenger for bringing the reality of this
educational deficit to our attention.

What are some of the possible reasons for these score differences?
One is that the number of women taking the SAT increased significantly
in the early 1970s, just as the number of men decreased. The growth in
the numbers and proportions of women SAT takers from only 44% in 1964 to
50% in 1975 and 52% since 1981 corresponds exactly to the time periods in
which the scores of women declined. This past year, there were about
40,000 more women than men who took the SAT. When dealing with average
scores on tests that are taken by self-selected populations, rather than
balanced samples of students at all ability levels, it is usual for higher
proportions of test takers to result in lower test scores. For example,
if only the top 10% of a group of students takes a test, the average
scores for this group would probably be higher than a much larger group of
students who represent a wider range of abilities.
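
The self-selection effect described above can be illustrated with a small calculation. This is a hedged sketch in Python using an invented population of 1,000 scores spread evenly from 200 to 800; the population, the 50% comparison group, and the helper name `mean_of_top_fraction` are assumptions made for illustration, not figures from the testimony.

```python
from statistics import mean


def mean_of_top_fraction(scores, fraction):
    """Average score of the highest-scoring `fraction` of the group."""
    ordered = sorted(scores, reverse=True)
    count = max(1, round(len(ordered) * fraction))
    return mean(ordered[:count])


# Hypothetical population: 1,000 students with scores spread evenly
# across the 200-to-800 reporting scale.
population = [200 + 600 * i / 999 for i in range(1000)]

# Average when only the top 10% take the test versus when the top 50% do.
top_10 = mean_of_top_fraction(population, 0.10)
top_50 = mean_of_top_fraction(population, 0.50)
everyone = mean(population)
```

In this toy population the average falls as a larger share of the group is tested, even though no individual's score has changed.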

We believe that the increase in the number of women taking the SAT --
presumably because more women are considering a college education --
should be regarded positively. It is indicative of changing mores and
social patterns that have heightened women's expectations about their
educational options and their careers.




Another reason for SAT score differences between men and women is also
related to shifts in population characteristics of students taking the
test. As the table below indicates, the percentage of women from the
various racial/ethnic groups is not equal. This data is also taken from
Profiles, College-Bound Seniors.

Table 2. 1985 College-Bound Seniors: Number
of Students by Ethnic Group and Sex

                        Total     Percent    Total Number   Percent
                        Number    of Total   Female         Female

American Indian          4,642       0.5         2,563        55.2
Black                   79,556       8.9        47,866        60.2
Mexican American        19,526       2.2        10,395        53.2
Asian American          42,637       4.8        20,959        49.2
Puerto Rican            11,077       1.2         6,000        54.2
White                  715,773      80.0       373,694        52.2
Other                   21,555       2.4        10,839        50.3

Total Respondents      894,766     100.0       472,316        52.8



The larger numbers of female test takers have, in recent years, included
more women from racial and ethnic minorities. For example, of the nearly
80,000 Black students who took the SAT in 1985, 60% were women. Women
represent 55% of the American Indians who took the test and the percentage
of women in the Puerto Rican and Mexican American groups were 54% and 53%
respectively. As I have noted earlier, it is well recognized that the
educational opportunities available to many of these minority students are
not the same as those available to white students. Indeed, within these
minority groups, females are often at a further disadvantage.
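
The percentages cited above follow from the Table 2 counts. The sketch below, in Python, recomputes them; it is an illustration rather than part of the original statement. The Mexican American total is taken as 19,526, the value that makes the group totals add up to the stated 894,766, and should be treated as an assumption. The name `percent_female` is illustrative.

```python
# (total test takers, female test takers) per group, as read from Table 2.
COUNTS = {
    "American Indian": (4_642, 2_563),
    "Black": (79_556, 47_866),
    "Mexican American": (19_526, 10_395),  # total is an assumption, see text
    "Asian American": (42_637, 20_959),
    "Puerto Rican": (11_077, 6_000),
    "White": (715_773, 373_694),
    "Other": (21_555, 10_839),
}


def percent_female(total, female):
    """Share of a group's test takers who are women, to one decimal place."""
    return round(100 * female / total, 1)


by_group = {group: percent_female(*pair) for group, pair in COUNTS.items()}

# Column sums, for comparison with the "Total Respondents" row.
all_takers = sum(total for total, _ in COUNTS.values())
all_women = sum(female for _, female in COUNTS.values())
overall = percent_female(all_takers, all_women)
```

The recomputed shares round to the 60%, 55%, 54% and 53% figures quoted in the text.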

We also know that, as a group, the women taking the SAT were less
likely to have followed an academic or college preparatory program in high
school than men and that, on the average, they took fewer years of study
in academic subjects than males. Parents of the females taking the SAT
had slightly less formal education than parents of the males and the
females tended to come from families with lower median incomes. Although
females do come from families of all educational and economic backgrounds
and many have taken rigorous academic programs, as a group they are not
quite as well prepared nor are they from homes as advantaged as the
smaller number of male test takers. Although I am gratified that more
women from backgrounds that traditionally have not considered college are
pursuing higher education, these data also raise concerns that such low
proportions of minority males are considering college.

The courses women take in high school also are a factor in explaining
some of the score differentials of male and female students. For example,
the more mathematics women study in high school, the better they do on the
SAT math section. The fact that women take fewer mathematics courses, on
average, than men probably explains a large part of the difference in the
SAT-math scores. Women also take fewer courses in the physical sciences.

I am sure that you share my concern that many young women are not
encouraged to take more math and science courses and that so few consider
scientifically-oriented career paths. It is difficult to know exactly how
much of the score difference in math is related to these unfortunate
social influences, but I personally am convinced there is no inherent
difference between men and women which precludes them from excelling in the
areas of mathematics and sciences. Tests continue to remind us of an
unfinished social agenda, and SAT scores reflect educational reality,
rather than educational ideals.

Although the difference in average verbal scores is not as great as
the difference in math, it is more difficult to suggest explanations for
the 11-point difference in SAT verbal scores between men and women. Some
of the difference is probably related to population differences described
earlier. We have examined the test specifications and many of the items
on previous editions of the test and have found no systematic explanations
for the difference. There are questions on which women do less well than
men and there are also those on which women do better; usually such items
are neutral in content and do not suggest any plausible reasons for the
differential performance. There have been numerous changes in test
content over the years; however, none of these changes coincide with the
times when there were significant shifts in scores.

During the past few weeks there have been allegations that the test is
constructed to intentionally produce scores at different levels for
various subgroups. I would like to state categorically that this is
absolutely untrue. The College Board is committed to administering fair,
effective and equitable tests. Our members would accept no less. As a
result of substantial research efforts, the Board believes that the SAT
reflects accurately the developed verbal and mathematical abilities of the
individuals who take it, regardless of their sex or racial or ethnic
background.

There are three basic methods used by the College Board to detect any
potential biases: reviews by numerous committees and panels, statistical
analysis and validity studies. It is significant to note that these
efforts date back to the 1920's - the very early days of the SAT.
"Precautionary studies" of the test performance of males and females, for
example, were conducted from the beginning, reflecting the Board's strong
concern in this area.






Since the late 1940s, the SAT and most other College Board tests have
been developed by the Educational Testing Service (ETS). ETS shares the
College Board's commitment to offer tests that are not influenced by
extraneous cultural, ethnic or social factors and toward this end employs
numerous procedures to ensure the tests are free from any such influences.

Current practices require that each new College Board test undergo a
sensitivity review to identify and eliminate ambiguity or
potentially offensive material based on race, sex, and cultural
background. Sensitivity reviewers are trained to ensure thorough
knowledge of the review process and consistent application of review
criteria. They are selected on the basis of their ability to perceive
potentially offensive material, to review tests from multiple
perspectives, not simply from the viewpoint of one group or
social/political philosophy, and to cover key subject areas such as
humanities and social sciences. During the past year, sensitivity
reviewers of new editions of the SAT included H women, four Blacks, two
Asian Americans, and two Hispanics.

In addition to these formal sensitivity reviews, each College Board
test is thoroughly reviewed by the high school and college faculty serving
on the SAT Committee, as well as external review panels for both the
verbal and mathematical sections. All committees and special review
panels are selected from among a cross section of backgrounds, including
minority representation and women, academic disciplines, institutional
affiliations and geographical representation.

Since the late 1970s, each new form of the SAT also includes at least
one passage dealing with minority issues. New tests are also reviewed
carefully to ensure that an appropriate variety of references to women and
minorities are included throughout the test. These content specifications
are most apparent in the reading comprehension passages and in the
sentence completion items, which have more text than the analogy, antonym
and mathematics questions.

Finally, it should be noted that women have been involved in all
aspects of the development of the SAT for several decades. Since 1973, a
woman has been the primary test development specialist for the verbal
test. Of the ETS staff members who spend significant amounts of
time working on the SAT, there are 11 women and 7 men working on the verbal
sections and 6 women and 3 men working on the mathematics sections. There
currently are 15 women and 10 men who serve as outside item writers for
SAT-Verbal; 9 women and 9 men outside of ETS write mathematics
questions.

Statistical methods that consider the performance of groups on
individual questions or clusters of questions are also used to ensure test
fairness and to detect any possible bias. We make available the raw data
to the research community in an ongoing effort to ascertain the specific
causes of differential performance on tests. For example, a Public Use
Sample data tape, containing all of the information about test candidates
from the largest administration of the SAT each year, is offered as a
regular service. The College Board welcomes external research on the issue
and invites the public to analyze or reanalyze the data.

The College Board has conducted numerous studies to examine how
different groups perform on various SAT items. During the past few years,
item fairness studies have been conducted on the basis of sex, ethnicity,
educational background of parents, and level of English proficiency. An
article published by the College Board in 198J, "The SAT in a Diverse
Society: Fairness and Sensitivity," summarizes the kinds of analyses that
result from these studies. The purpose of these statistical studies is to
monitor differential performance in order to (1) ensure that the SAT
remains appropriate over time for major subgroups of the candidate
population, and (2) identify possible content factors related to
differential performance. If the analysis identifies any questions with
large differentials, further analysis to identify the causes is conducted.

ETS has recently developed a new statistical procedure that holds
promise in further detecting any potential bias in our tests. Known as
"differential item functioning" (DIF), this statistical procedure matches
people of the same ability level before comparing their performance on
test questions. The assumption is that individuals of similar knowledge
and skill should have similar chances of answering a question correctly
without regard for their race, sex or ethnic background. The statistics,
thus, compare the performances of majority and minority students, and men
and women of similar ability. Research on DIF is continuing and plans are
being developed for its use at various stages in the test development
process.
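
The matching idea behind DIF can be sketched in a few lines. The Python below is a simplified illustration only, and it is not the ETS procedure, whose details the statement does not give. It assumes test takers have already been sorted into ability bands (for example, by total score), and it averages the within-band difference in proportion correct, weighting each band by the number of focal-group test takers. All names and data are hypothetical.

```python
from collections import defaultdict


def make_records(band, group, n_correct, n_total):
    """Build hypothetical (band, group, answered_correctly) response records."""
    return [(band, group, i < n_correct) for i in range(n_total)]


def matched_item_difference(records):
    """Focal-weighted average of (focal - reference) proportion correct,
    computed only within ability bands that contain both groups."""
    tallies = defaultdict(lambda: {"reference": [0, 0], "focal": [0, 0]})
    for band, group, correct in records:
        cell = tallies[band][group]
        cell[0] += int(correct)  # number answering correctly
        cell[1] += 1             # number attempting the item
    weighted_sum, weight_total = 0.0, 0
    for cells in tallies.values():
        ref_right, ref_n = cells["reference"]
        foc_right, foc_n = cells["focal"]
        if ref_n == 0 or foc_n == 0:
            continue  # no matched comparison is possible in this band
        weighted_sum += foc_n * (foc_right / foc_n - ref_right / ref_n)
        weight_total += foc_n
    return weighted_sum / weight_total if weight_total else 0.0


# An item with no matched difference: 40% correct in the low band and
# 80% correct in the high band for BOTH groups, although the groups are
# distributed differently across the bands.
fair_item = (
    make_records("low", "reference", 4, 10)
    + make_records("low", "focal", 12, 30)
    + make_records("high", "reference", 24, 30)
    + make_records("high", "focal", 8, 10)
)

# An item on which the focal group scores 10 points lower in every band.
flagged_item = (
    make_records("low", "reference", 4, 10)
    + make_records("low", "focal", 9, 30)
    + make_records("high", "reference", 24, 30)
    + make_records("high", "focal", 7, 10)
)
```

For `fair_item` the unmatched proportions correct differ (70% versus 50%) purely because of how the groups are spread across the bands, yet the matched difference is zero; for `flagged_item` the matched difference is about -0.10, which is the kind of result that would prompt further review of an item.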

The third method for determining if a test is fair examines whether it
predicts equally well for different groups of students. Predictive
validity is the measure of a test's effectiveness in predicting the
academic performance of a student in college. The College Board offers a
Validity Study Service, without cost, to colleges that wish to evaluate
how well their admissions data predict the academic performance of their
enrolled students. The service provides assistance in performing studies
of the predictive validity of high school records, test scores and other
information used in the admission and placement of students.

Over 1,300 of these validity studies have been conducted by colleges
and universities in the past several years to determine whether the test
scores predict the expected outcome in the freshman year. These studies
also help indicate the relative weight that should be given to SAT scores
and other data (such as high school grade-point average or high school
rank) in the admissions process.

These validity studies indicate that in over 500 colleges where
females and males were studied separately, the median correlation of the
SAT with college freshman grade point average was higher for women than
for men, as indicated in Table 16 of the ATP Guide. Thus, the SAT has
proved to be a more accurate predictor for women than men.

Much has been said recently about the so-called "underprediction" by
the SAT of women's college grades. The data on which this statement is
based comes from a research report authored by Mary Jo Clark and Jerilee
Grandy and published by the College Board. These data show that in the
particular studies analyzed, women's actual college grades were four
one-hundredths of a grade point higher than their predicted grades using a
combination of high school academic record and the SAT, not the SAT alone.
More significant is the fact that the prediction equations used in that
study were based on the sexes combined. However, as noted above, if a
prediction equation based on women alone was used, under- and
over-prediction would be eliminated. Our Guidelines on the Uses of
College Board Test Scores and Related Data specifically encourage colleges
to consider separate predictions of college grades based on gender, race,
and ethnicity.
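
The difference between a combined prediction equation and a separate one can be shown with a small worked example. The sketch below, in Python, fits an ordinary least-squares line with a single predictor; the predictor values and grades are invented for illustration and are not data from the Clark and Grandy report, and the names `fit_line` and `mean_residual` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def mean_residual(xs, ys, line):
    """Average of (actual - predicted); positive means underprediction."""
    intercept, slope = line
    return mean(y - (intercept + slope * x) for x, y in zip(xs, ys))


# Hypothetical data: the same predictor values for both groups, with
# women's freshman grades slightly higher at each predictor value.
predictor = [400, 500, 600, 700]
women_gpa = [2.6, 2.9, 3.2, 3.5]
men_gpa = [2.4, 2.7, 3.0, 3.3]

combined = fit_line(predictor + predictor, women_gpa + men_gpa)
women_only = fit_line(predictor, women_gpa)

under_combined = mean_residual(predictor, women_gpa, combined)
under_separate = mean_residual(predictor, women_gpa, women_only)
```

With the combined equation, women's grades in this toy data are underpredicted by 0.10 of a grade point on average and men's are overpredicted by the same amount. With the women-only equation the average error is zero, which is a general property of a least-squares line fitted with an intercept to the group it is used to predict.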

Data about the validity of the SAT for various racial groups have also
been studied by both the College Board and others. After two years of
intensive examination by experts, the 1982 National Academy of Sciences
study, Admissions Testing in Higher Education, concluded "that predictions
made from test scores are as accurate for black applicants as for majority
applicants, there is only scanty evidence available for other minority
groups. Subgroup differences in average ability test scores appear to
mirror like differences in academic performance as measured by course
grades, in this sense, the tests are not biased."

Before concluding, I would like to address briefly an issue that has
received much attention recently: the Empire State Scholarship (New York
State) awards and the disparate number of male recipients. You may be
aware that these scholarships are awarded on the basis of SAT or ACT
scores. The College Board does not support the use of SAT scores as the
sole criterion in any decision making process, even in admissions for
which the test is designed. But the issue here is one of social values
and whether the intent of the scholarship program is to recognize
scholastic ability (or any other factor) regardless of the composition of
population competing for such awards or whether such awards should be
apportioned among various subgroups. The designation of awards by
subpopulations is frequently a major part of many scholarship programs.
This might be allocating a certain number of awards for congressional
districts, a certain percentage for men and women or for various
racial/ethnic groups. Clearly, no single test will automatically result
in such desired allocations.

There have also been recent suggestions that women are being unfairly 
denied admission to higher education because of their SAT scores. The 
evidence is just the contrary. The increase in the number of women taking 
the SAT over the last twenty years has been mirrored in their 
college-going rate. More women seek entrance to, and attend, college than 
men. For example, total enrollments in higher education in 1983, the latest 
year for which statistics are available, were 52% female and 48% male. 
This is identical to the proportions of females and males that took the 
SAT in that year. Data provided by colleges for the Annual Survey of 
Colleges, which forms the basis for the College Handbook, include 
acceptance rate information separately for men and women for 77 different 
colleges that accepted fewer than 50 percent of their applicants to their 
fall 1985 class. Overall, the acceptance rate for women was 34%; for men 
it was 33%. Clearly, women are finding that the doors to colleges and 
universities are open to them. 

In conclusion I would like to reiterate our commitment to 
administering fair and equitable tests. The review process, statistical 
approaches, and validity studies I described earlier are continually 
examined, refined, questioned and analyzed. With changing demographics 
and the diversity of test takers, questions of bias and fairness will 
become even more significant and more of a challenge. This week the 
American Educational Research Association is holding its annual meeting 
here in Washington. A look at its program is illustrative of the 
importance researchers and test developers place on questions of fairness 
in testing and the analytical methods to achieve that goal. 

Thank you, Mr. Chairman. This concludes my prepared statement. I 
will be happy to answer any questions you might have. 






1986-87 

ATP GUIDE 

For High Schools and Colleges 

SAT and 
ACHIEVEMENT TESTS 

THE COLLEGE BOARD 



If You Have a Question about Services for Students 



PHONE V>TXUyM»r,U, 



Contact Your College Board Regional Office (back cover) 
for information about 

• s*Y)f(>fepO'l tCfTH" ir»iio'i,itc li,X' uu-is c>'r » 



Publications cited in this Guide can be ordered from 

Coio<)t'6o.r(J ATP 
CN 6? I? 



/* „ „ .- vruss-on of ATP Jes's ■'. tn ^ rj v» i n s G.j-<V sto ^rtp 

^fs ano ir.c t.vo «l!on«i of A , SA^ k^, joj 7,^ft^,cd' 0,i-j -ro Sc^o.>Jvc Apt 

lr« ou<Je to me ATP aryj reu-oo «.r, ce. 5 t-'c^vm -o- h vc ncx^s cno 
by Educas-ortaJ Toy n,i So^vcc cn iv.^ons i-a ^ !» - < >. -e^is o' 'v A?P fo- 
^C«!cge 8avd Up ;o H^. ^v!or.v .opes vvr^oo v. ^ ^n<i 
coflcoes fteo upon rcquos* Cooos n doi'M.' «y-, of -'-cc -1 v. ' v» ci c'(j,-f<i i' 
»i 00 pef copy 

The Admissions Testing Program is a program of the College Board, a 
nonprofit membership organization that provides tests and other 
educational services for students, schools, and colleges. The membership 
is composed of more than 2,500 colleges, schools, school systems, and 
education associations. Representatives of the members serve on the Board 
of Trustees and advisory councils and committees that consider the 
programs of the College Board and participate in the determination of its 
policies and activities. 






The Admissions 

Testing Program 

The College Board Admissions Testing Program (ATP) is designed to assist 
students, high schools, colleges, [illegible] as a channel of 
communication between students and these institutions. Because the 
subject matter of high school courses and grading standards vary widely, 
the ATP tests have been developed to provide a common standard against 
which students can be compared. 

The ATP consists of the Scholastic Aptitude Test (SAT), 
the Test of Standard Written English (TSWE), the Achievement 
Tests, and the Student Descriptive Questionnaire 
(SDQ). Closely related to the ATP are the Student Search 
Service, the Summary Reporting Service, and the Validity 
Study Service. 

[illegible] of school and college teachers who set specifications 
and define content. Meticulous care goes into the writing, 
pretesting, research, and evaluation of each of [illegible] 
throughout the administering, scoring, and reporting 
processes of the program. There are no age or grade restrictions 
for taking the tests, and all of them are available to 
students with handicaps. Most students take the tests during 
national administrations in their junior and senior years 
of secondary school. Some colleges have arrangements 
for testing students who have not tested before applying. 
(See "Testing on Campus," page 8.) 



The Scholastic Aptitude Test 

The SAT is a 2½-hour multiple-choice test that measures 
developed verbal and mathematical reasoning abilities related 
to successful performance in college. It is intended to 
supplement the secondary school record and other information 
about the student in assessing readiness for 
college-level work. 



The Test of Standard Written English 

The TSWE is a 30-minute multiple-choice test administered 
with the SAT. The TSWE measures students' ability to 
recognize and use standard written English. Scores can be 
used by colleges to help place students in appropriate 
freshman English courses. 



The Achievement Tests 

Achievement Tests are designed to measure knowledge 
and the ability to apply that knowledge in specific subject 
areas. Achievement Tests are independent of particular 
textbooks or methods of instruction. Although types of 
questions change little from year to year, the tests do 
evolve to reflect general trends in high school curriculums. 



'n«'<'t«).'«\j('jH-^>tjrri<jAitn«.i« 'hv » To !su'-yi tNfi r> 
4^<<ctn*j *4ijiV^Us 'i>« <dri\is\un < Os>fM" h'C'^o'iJ o- 
t»i>:'i SoTTM* ^o<^^'•. s'"^'* f'c U> t« 'fei or tf^e 

Mjt),Ot I t'<>'S |"0A ,»)plr 1' I- tn f 

Achievement Tests are offered in English Composition, 
Literature, American History and Social Studies, European 
History and World Cultures, Mathematics Level I, Mathematics 
Level II, French, German, Latin, Spanish, 
Biology, Chemistry, and Physics. All are one-hour multiple-choice 
tests, with the exception of the December edition of 
the English Composition Test, which is composed of 40 
minutes of multiple-choice questions and a 20-minute 
essay assignment. 



The Student Descriptive Questionnaire 

The SDQ, which is answered by about 90 percent of the 
students when they register for the SAT or Achievement 
Tests, contains questions about the student's background, 
high school courses, and other educational and extracurricular 
experiences, and plans for college study. 

The SDQ contributes to guidance and admissions by enabling 
individual students to present a broader picture of 
themselves to colleges than is conveyed by test scores 
alone. Students' answers to the SDQ and their scores on 
the tests are the primary sources of the tables in the ATP 
Summary Reports. The SDQ is also one of the sources of 
information used in the Student Search Service to identify 
students with specific characteristics designated by participating 
colleges and scholarship agencies. 

Note: Because the SDQ was new in 1985-86, students 
who completed the SDQ prior to October 1985 and who 
wish to have SDQ information reported to colleges must 
complete a new SDQ (if they register to test again) or submit 
an SDQ Update Form, included in summer bulk shipments 
to secondary schools. 



Score Reports to Students 

Approximately five weeks after the test date, students 
will receive their student report, called the College Planning 
Report, a two-page document that integrates their test 
scores with key information about themselves and the colleges 
they are considering. Designed to help students understand 
that their scores are only part of the college 
entrance picture, College Planning Reports contain students' 
current test scores (expressed both as numbers and 
as ranges); previous SAT, TSWE, and Achievement Test 
scores on record; national and state percentile ranks; secondary 
school courses and grades; and information about 
the colleges the students designated to receive score reports. 
(See pages 8 to 11 for a complete description of the 
College Planning Report and a sample report.) 



Score Reports to High Schools 

About five weeks after the test date, high schools receive 
at no cost a College Counseling Report for each 
student who gave that high school's code number when 




tC»jsUvf>^j ',v AlP ^^M. I., KM ton • 1 fr>^ ' j^iof,-. w ! 

VlW.'S -^-CVO '<» Vjos VhJ O'lt O 'f-t »0{., ^.0.1' , fi 

f ,jnj CO' tH;o ,v ns \i , o i ?Otj ^ , 

■0<,v»S <V;(J'MV\3 'fV ^'i\V( ( to -tv ^^,„ 
St« tH\l 1 on {H,J.^ ^ nHi , „j , p,, 

in.jO'Hon Ml/ xf^OC- s r<\ ,> v<' 

•OJH f'lHJ 1^ tAt-- cJ ts N v«st 1 Id vtl .1 in n sir jtO'v> 
I'o J so no'CJ 

• Score fOStcr-^ <,)p*i.i{'tlc>7 ! <if<; o' «U<^l<'nts tjv a^^^V 
'evd conljnnQ curfcnl scores <;e» .incj c'^nc 
')'Ot/p) t'ia« a'O vnj to scnoo-s TVr cki .U-nosiij 
fon Srudcrns Aho Ario ii>scp! o' Ahose sa->rrs .ro 
(JcMyovJ 'Ot anv rc isrn vvi U) .A'n.'o^l o.-ly on ro 
VO'O roller Cu<n>'.^!vO rollers vvi' tv,' vnt !o srhoo-S 

High schools and school districts also may purchase 
magnetic tapes containing each student's SAT and 
Achievement Test scores. These reports are sent after 
every test administration and include scores from previous 
test dates as well as SDQ responses. 



Score Reports to Colleges 

About four weeks after each test date, colleges receive 
reports for all students who indicated that their test scores 
be sent to them. Colleges may request reports in one of the 
following formats without charge. Additional formats are 
available for a fee. 

• The College Admissions and Advising Report contains, 
in addition to the student's current and previous 
test scores and academic profile, a student 
information section useful for placement and advising 
purposes. (See page 12 and pages 14 to 17 for a complete 
description and sample report.) 

• Magnetic tape contains the information on the paper 
reports. 

• Pressure-sensitive labels give current and previous 
test scores reported as two digits. 

• Pressure-sensitive mailing labels contain only the 
name and address of the student. (Available only as a 
second option.) 

Further information about score report formats is included 
in the booklet 1986-87 ATP Score Report Options, which 
colleges receive during the summer. 

After each test administration, colleges that received 
scores for [illegible] or more students receive summary statistics 
on scores reported. These Summary Statistics provide 
distributions, means, and standard deviations for the students 
who requested that their scores be sent. In addition, 
colleges also will receive a Trend Data Report that includes 



The Student Search Service 



The Student Search Service assists colleges and governmental 
scholarship programs in identifying students 
with certain characteristics, based on information the students 
provide in the SDQ. Only postsecondary institutions 
[illegible] to be included 
by the U.S. Department of Education in its current Education 
Directory, Colleges and Universities; governmental 
scholarship agencies; and noncollegiate institutions, 
groups, associations, and [illegible] that are members of 
the College Board are able to use the service. Institutions 
participating in the Student Search Service specify 
student characteristics in which they are interested, such 
as grade average, a range of test scores, intended college 
major, ethnic background, religious preference, and geographic 
location. They can then request the names and 
addresses of students matching those specifications. In 
addition, the following student information is reported to 
them: sex, social security number, birth date, and secondary 
school. Individual test scores are not reported by the 
Student Search Service. Participating institutions then send 
these students information on their programs, admissions 
policies, financial aid opportunities, and the like. 

Students indicate their interest in being included in the 
Student Search Service on the Registration Form. About 
80 percent do so. The Service searches its files six times a 
year: after the December test for students who took the 
SAT; three times in the spring for students who took the 
Preliminary Scholastic Aptitude Test/National Merit Scholarship 
Qualifying Test (PSAT/NMSQT) the previous fall; in 
the summer for juniors who took the SAT in the previous 
testing year; and also in the summer for juniors who took 
Advanced Placement Examinations. (See [illegible] 
the Student Search Service.) 
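
The matching step described above can be pictured as a simple filter over student records. This is a sketch only: the record fields, the `search` function, and the sample roster are hypothetical and are not taken from any College Board system. It assumes that only students who opted in are searchable and that the result carries names rather than test scores, as the text states.

```python
from dataclasses import dataclass

@dataclass
class Student:
    name: str
    grade_average: float
    intended_major: str
    state: str
    opted_in: bool  # student asked to be included when registering

def search(students, min_grade_average=None, majors=None, states=None):
    """Return names of opted-in students who meet every stated criterion."""
    matches = []
    for s in students:
        if not s.opted_in:
            continue
        if min_grade_average is not None and s.grade_average < min_grade_average:
            continue
        if majors is not None and s.intended_major not in majors:
            continue
        if states is not None and s.state not in states:
            continue
        matches.append(s.name)  # names only; no test scores are returned
    return matches

# Invented roster for illustration.
roster = [
    Student("A. Example", 3.6, "Biology", "NY", True),
    Student("B. Example", 3.9, "Biology", "NJ", False),
    Student("C. Example", 3.1, "History", "NY", True),
]
biology_in_ny = search(roster, min_grade_average=3.5,
                       majors={"Biology"}, states={"NY"})
```

Here `biology_in_ny` contains only "A. Example"; the second student is excluded for not opting in, whatever the grade average.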



ATP Summary Reports 

Each summer the College Board produces a series of 
reports summarizing ATP test scores and data from the 
SDQ for the previous year's senior class who took the tests. 
Summary reports are produced for secondary schools on 
their college-bound seniors and for colleges on the seniors 
who sent ATP score reports to them. Separate reports 
are compiled on college-bound seniors by state, by region, 
and for the nation. See the School Guide to the ATP Summary 
Reports and the College Guide to the ATP Summary 
Reports for detailed descriptions. 

Because a new SDQ was introduced in 1985-86, some 
students in the class of 1986 will have completed the old 
SDQ and some the new SDQ. Consequently, it will not be 
possible to produce a full summary report, tapes, or college 
reports on applicants, accepted applicants, enrolling 
freshmen, or persisters for the class of 1986; only abbreviated 
reports will be produced for this class. Beginning with 
the class of 1987, a new series of summary reports will be 
available. 






The Validity Study Service 

The College Board offers this service without cost to colleges 
that wish to evaluate how well their admissions data 
predict their enrolled students' academic performance. 
The service provides assistance in setting up studies of the 
predictive validity of high school records, ATP scores, and 
other information used in the admission and placement of 
students. It also determines the best weighted combination 
of high school grades and test scores for estimating students' 
freshman-year academic performance at individual 
institutions. (See "Using Predictive Validity Studies," page 
26, and Guide to the College Board Validity Study Service.) 
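
One common way to find a "best weighted combination" of two predictors is ordinary least squares. The sketch below assumes that approach; the source does not say which method the Validity Study Service uses, and the function name and the sample figures are invented for illustration.

```python
from statistics import mean

def best_weights(hs, sat, gpa):
    """Least-squares fit of gpa = c + w_hs*hs + w_sat*sat; returns (c, w_hs, w_sat)."""
    mh, ms, mg = mean(hs), mean(sat), mean(gpa)
    dh = [h - mh for h in hs]
    ds = [s - ms for s in sat]
    dg = [g - mg for g in gpa]
    shh = sum(a * a for a in dh)
    sss = sum(a * a for a in ds)
    shs = sum(a * b for a, b in zip(dh, ds))
    shg = sum(a * b for a, b in zip(dh, dg))
    ssg = sum(a * b for a, b in zip(ds, dg))
    det = shh * sss - shs * shs
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    # Solve the 2x2 normal equations for the two weights.
    w_hs = (shg * sss - ssg * shs) / det
    w_sat = (ssg * shh - shg * shs) / det
    c = mg - w_hs * mh - w_sat * ms
    return c, w_hs, w_sat

# Hypothetical records for one institution, invented for illustration.
hs_gpa = [2.0, 3.0, 3.5, 4.0, 2.5]
sat_total = [900, 1000, 1300, 1200, 1100]
freshman_gpa = [2.1, 2.7, 3.3, 3.5, 2.6]
intercept, weight_hs, weight_sat = best_weights(hs_gpa, sat_total, freshman_gpa)
```

A predicted freshman average for a new applicant would then be `intercept + weight_hs * hs + weight_sat * sat`, with the weights specific to the institution whose records were used.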

Administering the Program 

Complete information regarding test dates, registration 
procedures, fees, special arrangements, and scoring services 
appears in the Registration Bulletin. Information to 
help counselors answer the questions they are asked most 
frequently is repeated below, with additional instructions 
for special situations. 

Publications Sent to Schools 
and Colleges 

During the summer, high schools receive a supply of the 
publications and forms listed below. Reference copies of 
the publications are also sent to colleges, which may order 
additional copies. 

• Registration Bulletin for the SAT and Achievement Tests 
(for distribution to students). Contains the Registration 
Form, SDQ, a list of test center codes in the region, college 
and scholarship codes, state and county codes, 
and information on ATP procedures and services. There 
is a New York State Edition and an International Edition 
of the Bulletin in addition to the four regional editions 
(Midwestern, Northeastern, Southern, and Western). 

• Complete List of Test Centers for the SAT and Achievement 
Tests (for reference). Contains all of the test center 
codes in all editions of the Bulletin. 

• Codes for SAT and Achievement Test Score Recipients 
(for reference). Contains a complete list of codes for colleges 
and scholarship programs, Upward Bound programs, 
and members of the U.S. Senate and House of 
Representatives. 

• Additional Registration Forms and envelopes (for students 
who register for additional test dates). 

• Taking the SAT (for distribution to students who intend to 
register for the SAT). Contains examples of each type of 
test question with directions, explanations, and other 
general test-taking advice, and includes a sample SAT 
and TSWE, answer sheet, correct answers, and scoring 
instructions. 

• Taking the Achievement Tests (for distribution to students 
who intend to register for the Achievement Tests). 
Explains the purpose of the tests and contains examples 
of each type of test question with directions, explanations, 
and sample questions. 

• Using Your College Planning Report (for staff use). 
Mailed to students with their score reports. Explains how 
students can use the information on their reports to help 
review their college selections. Also describes the information 
reported to colleges and how it is used. 

• ATP Guide for High Schools and Colleges (for staff 
reference). 

• Additional Report Request Forms (for students who 
need more than the one form they receive with the 
Admission Ticket). 

• SDQ Update Forms (for students who wish to revise or 
update information they provided on the SDQ). 

• School Code Poster (for display). Contains the school 
code number, test dates, and registration deadlines. 
There is a special edition of the poster for New York State 
and a special edition for countries other than the United 
States. 

• Publications Shipment Notice/Reorder Form (for ordering 
additional copies of program publications). 



Registering 

Instructions for completing the Registration Form appear 
in the Bulletin and on the form itself. The registration process 
in the national testing program is the same for the SAT 
and the Achievement Tests, but students must submit a 
separate Registration Form for each test date. The identification 
information provided by the students is used to accumulate 
scores on score reports. Remind the students 
to supply identification information exactly the same way 
in all contacts with the ATP to avoid delay or error. Consistent 
identification information also helps colleges combine 
ATP data with other information they receive about an 
applicant. 
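
Why consistency matters can be seen in a small sketch of record matching. This is an illustration under stated assumptions, not a description of the ATP's actual file system: the key used here (name and birth date) is a simplification, and the birth date, test dates, and scores are invented. Margaret Wright is the fictitious student used in the Guide's sample report.

```python
from collections import defaultdict

def accumulate(registrations):
    """Group scores by identification exactly as the student supplied it."""
    records = defaultdict(list)
    for name, birth_date, test_date, score in registrations:
        records[(name, birth_date)].append((test_date, score))
    return dict(records)

# Invented registrations for illustration.
registrations = [
    ("Margaret Wright", "1969-03-02", "1986-11", 480),
    ("Margaret Wright", "1969-03-02", "1987-05", 510),
    ("Peggy Wright", "1969-03-02", "1987-06", 530),  # same person, different name
]
records = accumulate(registrations)
```

The third registration lands in a separate record because the name was written differently, so a cumulative report built from the first record would omit that score until the mismatch was resolved.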

Special Arrangements for Students 
with Handicaps 

Special editions of the SAT (in regular type, large type, 
braille, and cassette versions) with extended testing time 
are available for students with documented visual, hearing, 
physical, or learning disabilities. Achievement Tests are 
available only in regular type but may be taken with extended 
testing time. Eligible students may take these tests 
at times arranged by the student and counselor. 

A second option is available for taking the SAT if students 
have documented learning disabilities that allow use 
of a regular edition and a machine-scannable answer 
sheet but require additional testing time. In such cases, 
students may take the SAT at the regular national administrations 
in November and May, at which time they will be 
allowed up to one and a half hours of extended testing 
time. Students who test on these dates will be able to order 
the SAT Question-and-Answer Service (see "Verifying SAT 
Scores," page 7). 




Fo» '«nv o' tnc jfwv. iff irnjoivn's <Ji;d^n)s ^.f < >(i >oi 

ro/ ifjcfrTfs Sp,yn v^^<•f. cjn {« fef}ue<rffvi 

fon> ATP $t.rv<«s 'c m ^^,^<.^,,po<} SL-iX^HS CN 6??6 
PT,ncoton fgj 08541 e^-'^e Of t)v Oi ' nq 609 77 1 7600 

If .1 ds.ii)-if^ ccx^ nc-; «oou re six^c^i a'fanycfronts of 
e>t(nooo {Cbt t rtjo ' It J Jcnts v^ou'd to<i s'Cf fo' me tea 

.n tie rvi'oaV ,wo<j- i-n ufv'oss l-oy -^c^s^ to .rr-ct .1-^ snnf, 



Other Special Testing Arrangements 

Special arrangements are made for students who for religious 
reasons cannot take the tests on Saturday or who 
live more than 75 miles from a regular testing center, and 
for service personnel who will be aboard ship at sea on a 
regular test date. To request special arrangements, students 
should: 

1. Complete and submit a Registration Form by the regular 
registration deadline for domestic students or the 
[illegible] 
the United States or Puerto Rico. 

2. Record test center number 01 000 as the first choice in 
Item 10 on the Registration Form. 

3. Enclose with the Registration Form and fees a letter 
explaining the reason for the request. A statement signed 
by a clergy member must accompany a request for 
Sunday testing. A statement signed by the commanding 
officer must accompany requests by service 
personnel. 

The College Board also provides special testing arrangements 
when school-sponsored activities (for example, 
an athletic competition, debate tournament, band 
contest) may prevent previously registered candidates 
from taking the tests at the regularly scheduled time or 
place. Students are charged the test center change fee 
for switching to an alternate test date. If you know of situations 
that require special attention, contact College Board 
ATP (see inside front cover) no later than 10 calendar days 
before the test administration. 

Fees, Fee Refunds, and Fee Waivers 

[illegible] Registration Bulletin. If students are absent from a test for which 
they registered, the College Board will refund the test fee 
(minus the service fee) if students ask for it. Fees cannot be 
transferred. Refund requests must be sent within two 
months of the scheduled test date. Service fees are not 
refundable. 

Fee waivers are available to eligible high school juniors 
and seniors who need to take the SAT or Achievement 
Tests but cannot afford the test fee. (Fee waivers are not 
available to seventh, eighth, ninth, or tenth graders.) Instead 
of the test fee, a fee waiver card must be submitted 
with the Registration Form. At the beginning of the school 
year, schools and special programs such as Upward 
Bound and community counseling agencies are sent 
guidelines on eligibility and are allocated fee waiver cards 
on the basis of the number they used the previous year. 

For information about fee waivers or additional fee 
waiver cards, contact your College Board Regional 
Office as early as possible before a test 
administration so that students can submit their Registration 
Forms with fee waiver cards before the registration 
deadline. Seniors who have never taken the tests have priority 
for fee waivers. Eligible students may receive only one 
fee waiver for the SAT and/or one for Achievement Tests. 
It may be used during either the junior or senior year. 
Fee waivers cover only the basic test fees, the SAT 
Question-and-Answer Service, and the SAT Score Verification 
Service. They cannot be used to cover a late fee, 
fees for additional reports, other service fees, or purchase 
of The College Handbook or Index of Majors. 

Fee waivers are available to nationals of countries other 
than the United States only if they test in the U.S., Puerto 
Rico, or U.S. territories. 



Cumulative Reporting 



If students provide the same identifying information each 
time they register, their reports contain current test 
scores and all previous SAT, TSWE, and Achievement Test 
scores from up to 11 previous test dates. It is not possible 
to send only the latest or highest test scores or separate 
reports for the SAT, TSWE, or Achievement Tests. If previous 
scores do not appear on students' reports, they 
should write to College Board ATP (Attention: Unreported [illegible]) 



Additional Reports 

Students may request additional reports at any time by 
completing an Additional Report Request Form. The fee 
for each additional report is $5.00. A form is enclosed with 
the Admission Ticket, and a supply is included in summer 
shipments to secondary schools. Colleges or scholarship 
programs may order forms preprinted with their code numbers 
to send to applicants who have not yet submitted official 
score reports. 

Because the SDQ was new in 1985, students tested 
prior to October 1985 who submit an Additional Report 
Request Form (and who do not plan to test again) must 
also complete an SDQ Update Form in order for SDQ information 
to be reported to colleges. A supply of SDQ Update 
Forms is included in summer shipments to secondary 
schools. 

Additional SDQ Update Forms can be requested by writing 
or calling College Board ATP. (See inside front cover.) 

Telephone Rush Request Service 

If a student wants colleges to receive scores sooner than 
usual, and if the scores have been processed (usually 
about three weeks after the test date), the student can call 
College Board ATP (609 771-7600) and request the rush 
score reporting service. Scores will be sent to the colleges 
and scholarship programs specified within two working 
days after the call. The student will receive a confirmation 
copy of the interim report (which contains ID information 
and scores only) and will be billed $15.00 for this service 
plus $5.00 for each report. Complete reports will be sent 
to the student and colleges during the next scheduled 
processing. 

When students call, they should provide identification information 
as recorded on their Registration Form, the most 
recent test date, and the names and code numbers of the 
colleges and scholarship programs that should receive 
interim reports. 
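
The rush charge described above is a flat service fee plus a per-report fee. A minimal sketch of the arithmetic follows; the two dollar amounts come from the text, while the function name and the rule that at least one recipient is required are assumptions made for the example.

```python
RUSH_SERVICE_FEE = 15.00  # dollars, flat charge stated in the text
FEE_PER_REPORT = 5.00     # dollars per report, stated in the text

def rush_request_cost(number_of_reports):
    """Total billed for one telephone rush request."""
    if number_of_reports < 1:
        # Assumption: a request must name at least one recipient.
        raise ValueError("a rush request needs at least one recipient")
    return RUSH_SERVICE_FEE + FEE_PER_REPORT * number_of_reports

cost_for_three = rush_request_cost(3)
```

A student naming three colleges would be billed $15.00 plus three times $5.00, or $30.00.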

Automatic Reports to 
Scholarship Programs 

Only students can request that their scores be sent to 
high schools, colleges, and scholarship programs. Scores 
for all seniors who attend high school in or who reside in 
Florida are routinely sent to their state's scholarship program. 
Scores for all juniors in Pennsylvania and for all juniors 
in Illinois who test between January 1, 1987, and June 
30, 1987, are routinely sent to those states' scholarship 
programs. In Rhode Island, scores are sent for all seniors 
who take the test in November and December 1986 and 
January 1987. In Maryland, the most recent SAT scores 
are sent for all state scholarship applicants. If students who 
live in or attend school in one of those states do not want 
their scores sent to the state scholarship agency, they 
should notify College Board ATP, CN 6200, Princeton, NJ 
08541-6200, by the appropriate date: Florida and Rhode 
Island, January 31, 1987; Maryland, February 15, 1987; 
Pennsylvania, May 31, 1987; Illinois, August 1, 1987. Students 
and counselors in New York State should refer to the 
New York State Edition of the Bulletin for a notice of the 
special reporting procedures used for the New York State 
Regents Scholarship Program. 

Changing SDQ Information 

Students need to complete the SDQ only once. (See 
note below.) If they register for a subsequent test date, they 
can update answers. However, they must answer the entire 
question because their new answer will completely replace 
their previous answer. For example, if they have 
taken a calculus course since the last time they answered 
the SDQ and want to update their SDQ by including this 
information, they must record all their previous math 
courses as well as calculus, even though they recorded 
these courses the first time they answered the SDQ. Their 
previous answers to all other questions will continue to be 
reported as they were to high schools and colleges. 
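
The update rule above (a new answer replaces the whole previous answer to that question, and other answers are left alone) behaves like replacing one key in a record. The sketch below illustrates this with invented question names and answers; it is not a description of how SDQ data are actually stored.

```python
def update_sdq(previous, changes):
    """Replace whole answers for the questions in `changes`; keep the rest."""
    updated = dict(previous)  # copy, so the answers on file are not altered
    updated.update(changes)   # each changed question is overwritten in full
    return updated

# Hypothetical answers on file.
on_file = {
    "math_courses": ["Algebra", "Geometry", "Trigonometry"],
    "intended_major": "Biology",
}

# Recording only the new course loses the earlier ones.
incomplete = update_sdq(on_file, {"math_courses": ["Calculus"]})

# Recording the full list keeps them.
complete = update_sdq(
    on_file,
    {"math_courses": ["Algebra", "Geometry", "Trigonometry", "Calculus"]},
)
```

In both cases the answer to the untouched question, `intended_major`, is carried forward unchanged.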

Students can make changes in their SDQ at any time by 
calling College Board ATP (609 771-7600). 

Note: Because the SDQ was new in 1985-86, students 
tested prior to October 1985 who wish to have SDQ information 
reported to colleges must complete the current 
SDQ (if they register to test again) or submit an SDQ Update 
Form, included in summer bulk shipments to secondary 
schools. 



Verifying SAT Scores 

dOhf. *o vertty bAt vo'os h t SAT \ , n l 
AnsAOf ber/ite and li'o SAT Sift \tr'i i" ' '->«'>. » 
t t"ef may oooftJercd iji> '0 ' vc f'Hii>t*> .!"(.'!*'> 't '<.n'< 

Slud<;nts Afio soke tf'< V»T <> \ < v o' ' vc . t-t • 
•lod n 'n<' Re7'Sf'J'/0') ^-Jv o''V t» « SA' 

QuOi'Ona'idAnsAOf b»»rvt(. fhn %s 'ix <«.»!•» i i a; 
giXiS'iO'^S ttVCOtrcst isiA«rs i->t'Uvtuii. 'ij 
and a LOO/ O' in«'r i'fvAr> bhLVl uSn-i bAT d,i«-s 
<>!u(»oii1s caii v. ''V "'Or bAT v'O'O tw iyd»" • i S*T 
Store Veri'citon S^'vcf Tr>o SAT S< ore V'"*''i^i''-> > 
Vtce i-VudCS ai niO'C rl'S prov.fJcxl rho SAt Ovy' iJ > 
and Answer Sorv<u; <.t^^' IlM ^JJ«^1or^. A w'v 

form S'PUs/"y Vot.f Co'Vy;-' Af-'- n,j ^( p(>f' 

for both i^ie Ouestoi anti AnsAO' Sc'vcc a 'J c 
Score Ver-tcato" Serve*' • t'lu secure caVi >i t»y 
t'le student*; dsaycf w'ti v^o SAT <<;o'c< on v-C^o 
report f^e stwJonii, may rciueM rciif c ^.j u' 'no f a'^bi.%e' 
srioe! t' r(,«sc0' ng C0'>' rm^ th.i' i, ,>r'of rvi.! ty.M>.» r*^ »>' 
(fCSU-t ng o ethe* hsj'icr ot lov^or si,jrc t'\i i ''-os* n 
na'v ref)0'ted) COrroctcJ repo-ls A" t''' son? wt"j • 



Preparing for the Tests 

For students to perform to the best of their ability, they 
should know what the test is about and how it is structured, 
how to make the most efficient use of time, how to 
attack the different kinds of questions, and when an educated 
guess using partial knowledge is useful. For this 
reason students should be encouraged to study the material 
in Taking the SAT and Taking the Achievement Tests 
and to complete the sample questions that are included. 
Schools may choose to assist students in this process 
through group meetings and discussion sessions to emphasize 
the importance of this preparation. 

Special Preparation for the SAT 

For more than 25 years the College Board has sponsored 
research on the effects of special preparation programs 
on SAT score results and has supported 
independent investigation of the topic by others. On the 
basis of present knowledge, the College Board has prepared 
a statement to assist students in making decisions 
about special preparation for the SAT. A reprint of the statement 
follows. 

• The SAT measures developed verbal and mathematical 
reasoning abilities that are involved in successful academic 
work in college; it is not a test of some inborn and 
unchanging capacity. 

• Scores on the SAT can change as you develop your verbal 
and mathematical abilities both in and out of school. 

• Your abilities are related to the time and effort spent; 
short-term drill and cramming are likely to have little effect; 
longer-term preparation that develops skills and 
abilities can have greater effect. One kind of longer-term 
preparation is the study of challenging academic 
courses. 






• White drM and practc© on sample tesj qyest-cis gener 
a»y f©$u^ in I tto of ect on test saves pfc^raicn of tn-s 
k)o<j can 'amJaf ize yoy w th d «efent ^/pe$ of oue^cns 
and may rejp to redocc your anxiety about whai to 
expect 

• Whether longer preparajon apaf ! (torn that avaifatye 'o 
you with n yoor fegJa' h^ school courses ts worth the 
tJTie, eKOft, and money « a oec-s^ you and your par 
enfe must make yourselves results seem to vary con 
S'deraWy from program to program and »0' each 
person w th'O any oo« program Stud« of spec^i pfcp- 
araton programs earned on to many schot^s show 
varous resu^is averag ng about 10 pc-r:s 'or t^e vo-ba' 
secton and !5 pcnss ♦or the mathemaica) a/er and 
above the average increases that wou«d ov^&w^ be 
expected in other programs resu-ts have ranged from 
no 'mprovement m scores to average ga n$ of 25 30 
P0-n;s for partcu^ar groups Of students or partcu'a' pra 
grams Recent stud es of commerctai coach-ng nave 
Shown a sm<ar range o» resuHs You shouy saiisJy your 
se^f that the resu^s of a spcc>ai program or course are 

I keiy to make a d fference m re<atcn to your co^ege ad 
ni.*$x>ns plans 

• Generally, the soundest preparation for the SAT is to
study widely, with emphasis on academic courses and
extensive outside reading. Since SAT score increases of
20-30 points result from about three additional questions
answered correctly, your own independent study in addition
to regular academic course work could result in
some increase in your scores.


Testing on Campus

ATP tests are available for institutional use outside of the
national testing schedule. Colleges and universities can
administer the SAT, TSWE, and Achievement Tests on campus
to applicants who have not previously taken these
tests.

Some colleges that need to know an applicant's scores
immediately for admission or placement purposes have
the option of scoring the answer sheets on campus. For
further information write to Multiple Assessment Programs
and Services, The College Board, CN 6725, Princeton, NJ



Score Reports for Students: 
The College Planning Report 

The two-page College Planning Report includes the student's
test scores, information given by the student on the
SDQ, and information provided by the colleges to which
the student is having scores sent.

The back of the report contains information on the scoring
process and the meaning of scores and percentiles.
Accompanying the report is a booklet, Using Your College
Planning Report, which explains the information received
by colleges and how it is used. It also explains how students
can use the report to review their college selections.



The College Planning Report

The numbered sections below and on page 9 refer to
parts of the sample College Planning Report (pages 10
and 11) for Margaret Wright, a fictitious student. The sample
report has corresponding numbers to indicate the part
of the report being explained in each of the following
sections.

Identification Information

Much of the information in this section, particularly
sex, date of birth, and social security number, is used to
retrieve Margaret's data from ATP files, which are stored
for the College Board at Educational Testing Service. Submission
of her social security number is optional, but it will
be used to help identify her record and add scores if she
takes ATP tests at another time. It may also help her high
school and the colleges that receive her scores to match
her record to their files.

Test Scores

This section shows, for the most recent administration,
Margaret's SAT verbal and SAT mathematical scores, reported
both as specific numbers and as score ranges representing
one standard error of measurement (SEM)
above and below her numerical scores. (See page 19 for a
discussion of SEM.) The TSWE score, but no score range,
is also reported here. If Margaret had taken one or more
Achievement Tests instead, those scores and score ranges
would have been reported here.

Score Ranges. The SEM, rounded to the nearest 10
points, is 30 for Margaret's SAT verbal score and 40 for her
SAT mathematical score. The score ranges are thus 450-510
for her verbal score and 460-540 for her mathematical
score. The score range (or the SEM) is determined by the
precision of the test, which is greater for some scores than
others. For the SAT, most rounded SEMs will be about 30
points for verbal scores and about 40 points for mathematical
scores. Some SEMs will be smaller, particularly for high
scores. Presenting the score as a range helps to illustrate
that the SAT score gives an approximation rather than a
precise measure of ability.
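
The arithmetic behind the score ranges can be sketched in a few lines of Python. This is only an illustration: the function name score_range is hypothetical, and clipping the range to the 200 to 800 scale is an assumption that the guide does not state.

```python
def score_range(score, rounded_sem, low=200, high=800):
    """Return (lower, upper): the score plus or minus one rounded SEM.

    Clipping to the reporting scale is an assumption made for this sketch.
    """
    return max(low, score - rounded_sem), min(high, score + rounded_sem)


# Margaret's scores as given in the guide: verbal 480, mathematical 500.
print(score_range(480, 30))
print(score_range(500, 40))
```

With Margaret's verbal score of 480 and mathematical score of 500, the sketch gives the same ranges quoted in the text.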

Percentiles. Margaret's report also includes percentile
ranks that show the relationship of her scores to the scores
of others in each of three reference groups. The percentile
rank tells what percentage of that group obtained scores
lower than Margaret's. The percentiles section of the score
report compares a student's scores with the following reference
groups:

• College-bound seniors (national): all students in the
1985 graduating class who took the SAT or the
Achievement Tests at any time while in high school.
This reference group includes only the students in a
given year's graduating class, and only the most recent
SAT and Achievement Test scores for each student
are counted.







• College-bound seniors (state): all students in the
state in which Margaret attends high school who were
in the 1985 graduating class and took the SAT. (State
percentiles are not given for Achievement Tests.)
• National high school sample: a probability sample
of all high school students in the nation, based on a
special administration of the PSAT/NMSQT in October.
(See Appendix A, Table 1a.) Students in the
sample are not limited to those considering college
or planning to take the SAT.

The college-bound seniors percentile ranks used on
score reports are updated annually to allow comparisons
with the recent groups of students. Comparisons of
this year's percentile ranks with those for previous years
can be made by referring to the annual College-Bound
Seniors series.

Margaret's SAT verbal score of 480 puts her at the 67th
percentile among the college-bound seniors in the nation
(see Table 10 on page 22), at the 54th percentile among
college-bound seniors in her state, and at the 83rd percentile
among all students in the national high school sample,
who may or may not have enrolled in college.
Margaret's SAT mathematical score of 500 puts her at
the 57th percentile among college-bound seniors in the nation,
at the 40th percentile among college-bound seniors in
her state, and at the 71st percentile among all students in
the national high school sample.
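
A percentile rank of the kind described here, the percentage of a reference group scoring lower than the student, can be sketched as follows. The function name and the ten-student group are hypothetical and are not drawn from any ATP table; rounding to a whole number is also an assumption.

```python
def percentile_rank(score, group_scores):
    """Percentage of the reference group with scores strictly lower than `score`."""
    if not group_scores:
        raise ValueError("reference group is empty")
    lower = sum(1 for s in group_scores if s < score)
    return round(100 * lower / len(group_scores))


# Hypothetical ten-student reference group (illustrative data only).
group = [350, 400, 420, 450, 470, 480, 500, 530, 580, 640]
print(percentile_rank(480, group))
```

The same score yields a different rank against each reference group, which is why one score can sit at the 67th, 54th, and 83rd percentiles at once.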

Table 11, page 23, shows the percentile ranks of SAT
mathematical scores for men and women separately. Note
that the percentile ranks on the score report are determined
from data that combine scores for men and women.

Because Margaret took the SAT and Achievement Tests
during her junior year, the third section of her report shows
not only her current SAT verbal and SAT mathematical
scores, including her verbal subscores and TSWE scores,
but also scores for the SAT she took in May 1986 and for
three Achievement Tests taken in June 1986. Results for
up to six SAT and six Achievement Test administrations
may be shown.
Margaret took the SAT the first time as a junior (in May)
and scored lower on both the verbal and the mathematical
sections than she did when she took the test again the
following November. Her new scores, however, are both
within the SEM. (See Tables 6 and 9, page 21, for a student's
chances of equaling, exceeding, or decreasing
scores.)

Scores from the English Composition Test with Essay are
distinguished on score reports by the notation ES, but all
students receive scores on the 200 to 800 scale regardless
of whether the form contains all multiple-choice questions
or a combination of multiple-choice and essay
questions. Scores on the two forms can be compared directly
and interchangeably. Subscores are reported for
the essay part of the December test.
Specific percentiles for Margaret's scores on the
Achievement Tests appeared on the reports received
immediately after the June administration. Achievement Test
percentiles appear in Table 13 on page 24.

Note: If an asterisk appears next to a test date in section
3, a message about the scores for that test date is printed
immediately under the section.



This section, compiled from information Margaret reported
on her SDQ in September 1986 (questions 1-11),
shows how many years she studied in each of the academic
areas listed, including the arts and music; whether
the courses were honors (including advanced placement
or accelerated courses); her average grades; and the curriculum
covered. It also indicates her report of her grade-point
average and class rank.

Sections 4 and 5 of the College Planning Report allow
Margaret to check that the information submitted on her
SDQ is accurate and up to date.

Plans for College

This section, also compiled from the SDQ, shows Margaret's
degree goal, first choice of major, and the degree of
certainty of her first choice (SDQ questions 20-22). She
could have listed up to four other options for her major
(SDQ questions 23-26). Under Requested Services, a student
can indicate interest in education and career counseling
and developmental academic programs (question 30).
Under Preferred College Characteristics, a student can
indicate such preferences as location and religious
affiliation (questions 14-19). Under College Programs and
Activities, the student can indicate programs and extracurricular
activities that may be of particular interest in college
(question 31). Margaret's advanced placement or exemption
plans are also indicated (question 27).

Colleges and Scholarship Programs
Receiving a Score Report

This section provides information on the colleges and
scholarship programs to which Margaret has designated
that scores be sent, up to a total of eight. Application and
financial aid deadlines, address and telephone, and the
colleges' criteria for admissions decisions in order of
importance (high school record, test scores, extracurricular
activities, and so forth) are indicated. This section helps
Margaret to keep current with the college application process
and allows her to consider her qualifications compared
with admissions priorities stated by the colleges she is
interested in.






Score Reports for High
Schools: The College
Counseling Report



The score report for high schools (illustrated on page 13)
contains on a single 7 by [illegible] inch form most of the
information in a student's College Planning Report. It can help the
counselor increase students' awareness of the range of
educational opportunities available. After the student
leaves the school, the report remains a source of information
for research and statistical reports.

The College Counseling Report contains most of the
information found in the College Planning Report, with minor
variations in formatting. (Note that the College Counseling
Report includes the student's test center code from the
most recent administration.) For a discussion of the sections
on identification information, test scores, educational
background, and college plans, see pages 8 and 9. The
section that is unique to the College Counseling Report
lists the colleges and scholarship programs, to a total of
eight, to which the student sent scores from the most recent
testing.



Using the Report 

A comparison of her objective evaluations and her aspirations,
all listed on the report, can help the counselor
probe Margaret's reasons for her stated educational
choices and, if necessary, lead her to some alteration in her
thinking. With the report's help, the counselor can ask: Is
she aware of her potential? Is she ignoring special talents
or interests? Has her high school program adequately prepared
her for the college regimen she plans to pursue?

The counselor can also determine whether Margaret's
interests and preferences are reflected in the colleges to
which she is sending her scores. Do they reflect the size,
location, or religious affiliation she has indicated? Do they
offer the types of academic and extracurricular programs
her high school career reflects?

Finally, the report can help the counselor monitor
whether Margaret has had score reports sent to the colleges
that interest her.



Explaining and Using Score Ranges

Hef coohseky 4 .n ti ^ best postcn to hoip Margaret 
and her tanMy undo* stand the nrtean.r»g and I m tatons ol 
lest scores The corcept that an SAT o* Achievement Test 
score can oh<y approxirT«tc<v eva'uato Margaret s aW.ty .s 
d t?<u!i to grasp w-th a spec'c fHjni^rica) score The V suai 
i/aton o1 the standard error ot measuremeni- score 
ranges - akjng w.th the ex f^anator^ on U^o back o' *ho Col 
lege Pianftffg Report should he'p 

Although the precise score is not an absolute representation
of Margaret's ability, the range around this score
tends to be a very good measure. Unless there is a compelling
reason to think that Margaret had some special
problem with a given test, it is unlikely that her score would
fall much outside the original score range in a few months'
time. Counselors can use this knowledge in starting early
college planning with junior test takers and in advising
about retesting, especially if the student has taken a test
several times with similar results.



Additional Counseling Materials 

As an adjunct to the College Planning and College
Counseling Reports, Margaret should be encouraged to
avail herself of the many services and publications designed
to help her and her family plan for college. Materials
produced by the College Board include The College
Handbook and Index of Majors, ScoreSense and College
Explorer (microcomputer programs), audiovisual kits,
and numerous publications. For a complete listing consult
the 1986-87 catalog, available from College Board ATP, CN
6212, Princeton, NJ 08541-6212.



Score Reports for Colleges:
The College Admissions
and Advising Report

The score report for colleges contains a wealth of information
about potential candidates, which can be used before
and during the application process and after
enrollment for placement and advising. The name College
Admissions and Advising Report reflects its use as
more than a source for test scores.

The two-page form, which folds into a standard file with
the student's name running across the top, includes both
scores and student descriptive information compiled from
the SDQ. The other colleges, if any, to which the student
had score reports sent are noted. A sample report is on
pages 14 and 15.



The College Admissions and
Advising Report 

Identification Information

Sex, date of birth, and social security number (optional)
identify student records. If students request that updated
reports be sent to colleges at times other than scheduled
release dates, comparing the report date, date of SDQ,
and the test dates helps to determine which information is
current. The telephone number enables admissions officers
to communicate quickly with a student for recruitment
purposes or for follow-up on an incomplete application.
The state and county of residence provide helpful information
if the college wants more geographic diversity or if the
state or the county is one in which the college intends to
conduct greater recruitment activity. A report showing that
the student's address differs from the student's legal residence
may mean that the student needs to complete documents to
[illegible]. Citizenship status can be useful for identifying
students eligible for government-sponsored financial
opportunities such as loans, grants, and work-study programs,
or for requesting additional documentation. A student's
religious affiliation or preference and ethnic identity, which
are indicated only if requested by the college on the ATP
Score Report Options order form, may be used in specialized
recruitment. For example, colleges may wish to inform students
of the availability of special-interest groups on campus.

[Sample reports, pages 13 to 15: the College Counseling Report
and the two-page College Admissions and Advising Report for
Margaret K. Wright, the fictitious student, showing her SAT
verbal score of 480, SAT mathematical score of 500, percentile
ranks, and responses reported on the Student Descriptive
Questionnaire. The forms are not legible in this copy.]

Test Scores (see College Planning Report)
Summary of Test Scores (see College Planning Report)

Educational Background

The self-reported items in this section of the report are
good indicators of a potential for college work. Grades,
class rank, honors courses, and expected years of study in
certain subjects all reflect the student's interest in learning
and response to learning opportunities. There is also a
section with specific information on the student's coursework
and experience. For example, for some institutions or
courses of study it might be essential to know that a student
took calculus, not just four years of mathematics.

Honors courses may be considered for placement or
credit or as examples of motivation. Admissions officers
may give extra weight to good grades in honors courses
because they probably represent a considerable level of
achievement. It is also useful for admissions officers to
know about the school the student attended. If, for example,
a school offers a number of Advanced Placement and
honors courses and a high percentage of seniors attend
college, lower grades and class rank might be acceptable.
The total high school program and grades can be considered
in conjunction with test scores to determine
whether the student's intended majors and educational
objectives are realistic. If a student's grades, scores, and high
school program demonstrate ability in mathematics and
science, for example, is the designated career one in
which these abilities will be important? Has the student applied
to colleges where training in those areas is available?
Does the student have the subject matter depth necessary
for the career or major indicated? On the other hand, if the
student is undecided about a career or a major, the college
may want to inform the student of academic and career
opportunities that are compatible with the self-reported
interests and grades.

High School Information

The information reported in this section is provided by
the high schools.

High School and Community Activities

Margaret's extracurricular experiences (interests, activities,
out-of-school learning, community service, and the
like) are presented in graphics that show how long
Margaret has been involved and how current her interests
are. If she received honors or served as an officer, this is
also indicated.

Sports and Student's Plans

Student responses should be shared with campus
groups that might wish to forward appropriate information.
Students revealing an interest in sports might welcome a
schedule of intramural athletic events. Students expressing
interest in participating in other activities could be sent
copies of the college newspaper, lists of clubs, programs
of cultural or religious activities, or social schedules, as
seems appropriate.

Advanced placement or exemption plans can be looked
at to see if they are consistent with the student's high
school grade in the same subject, to plan for curriculum
and staff, or to indicate possible receipt of separately reported
Advanced Placement (AP) Program or College-Level
Examination Program (CLEP) scores. In planning
recruitment activity, admissions officers might want to reexamine
how much credit the college offers for examination
and how effectively prospective applicants are informed of
these opportunities. Perhaps more students would be attracted
if they could begin studies at higher levels or take
more accelerated courses.



Using the Report as a Marketing Tool

Students can send up to four reports to colleges as part
of their basic test fee. The information about students on
the reports can help colleges focus their marketing and
recruitment activities. First, it enables a college to develop
a valuable list of prospects made up of students who not
only are aware of the college but have demonstrated a
certain level of interest.

Then, by matching a few student characteristics identified
on the report with high-priority marketing interests of
the college, each institution can develop key lists for recruitment.
The market designation from the Enrollment
Planning Service is included in the identification block. The
EPS market can be used in conjunction with Enrollment
Planning Service data to classify students who submit SAT
scores into EPS markets for follow-up and evaluation of
recruiting. For instance, the information on the reports will
allow colleges to identify quickly students seeking special
academic programs, those looking for colleges of a certain
size or location, or those with certain extracurricular
experiences or high school academic records.

Colleges that receive score reports in tape format will be
able to use their computers to separate students by much
more detailed criteria.



Using the Report Before Enrollment 

Admissions officers can use the extensive information on
the report to assess whether the college offers what the
student is seeking and whether there is a reasonable
chance of admission. Recruitment of potential applicants
can be more effective and efficient if students are provided
information that is related to their plans, interests, and
previous preparation as revealed in their score reports.

In the admissions process the ATP report, often with a
copy of the high school transcript, the completed college
application form, letters of recommendation, and personal
interview evaluations, is part of the admissions file, the
basis for the decision whether or not to admit the student to
the institution. Although the high school record may play
the strongest role in this decision, ATP scores are particularly
useful because they provide a common yardstick of
academic potential that is independent of the student's particular
curriculum, high school, or region.



Using the Report After Enrollment

After admission, the College Admissions and Advising
Report helps in making course placement assignments
through the use of SAT, TSWE, and Achievement Test
scores and the responses to SDQ questions on subject
matter preparation (such as years of study and courses
taken in high school), plans for advanced placement, and
needs for special assistance. The report can also acquaint
college personnel with the characteristics and interests of
the students they will be counseling.

Data valuable for planning that involves estimating faculty
work loads and the demand on physical facilities can
be extracted from the report. Knowing students' housing
preferences can help the housing office gauge space
needs for the coming year.

Data in the Requested Services section give an early
indication of the kinds of services that incoming freshmen
think they will need. Department heads and special services
personnel might welcome the information for planning
for curriculum and staff, expanding reading skills or
other developmental centers, determining which counseling
areas should be strengthened, and developing available
work-study opportunities.



The Reliability of Self-Reported
Information

In making decisions based in part on self-reported information,
colleges will want to know how reliable such data
are. Evidence shows that student-reported information is
often as valid for individual educational decisions as information
gathered in more expensive ways. If, as in the
SDQ, questions are carefully worded, deal with matters
that are relatively recent occurrences, pertain to current
concerns and interests, and can be verified, answers to
them may be used with a degree of assurance. (See Using
Self-Reports to Predict Student Performance.)



Understanding
College Board Scores

College Board scores for the SAT and the Achievement
Tests are reported on a scale of 200 to 800. The choice of
any score scale becomes meaningful only as data are
compiled from the scores of various groups of students
taking the test. Users learn to understand and appreciate
the meaning of a score of 430 in the same way that they
have learned to understand and appreciate the meaning
of, say, 14 inches, a process that is possible only if the
measuring units remain constant. In the early years of the
SAT the test was rescaled each year so as to provide an
average score of 500. Since 1941, however, a constant
scale has been used that is maintained through a process
known as equating. Scores on each new form of the SAT
or the Achievement Tests are calibrated against prior
forms. As a result, different forms of a test such as the SAT
yield scores on the same 200 to 800 scale, thereby enabling
the user to compare scores of students who take the
test at different times. For example, within the limits of the
equating methods employed, a verbal score of 430 on the
SAT today represents the same level of developed verbal
ability as it did several years ago. Therefore, students'
scores can be compared from one administration or one
year to another.

In addition to calibration against prior forms, Achievement
Test scores are maintained by periodic rescaling
studies in order to make scores from different tests roughly
comparable. Because rescaling may affect the placement
of the tests on the scale, some year-to-year differences in
scores may be due to rescaling as well as performance.
Achievement Tests have not been rescaled since 1980.

Equating procedures are under continuous review. A
1976 study verified that the SAT scale had not shifted substantially
between 1963 and 1973 and that the score declines
reported nationally could not be attributed to a drift
in the score scale.

Separate verbal and mathematical scores are reported
for the SAT on the 200 to 800 scale. SAT verbal subscores
for reading comprehension and vocabulary are reported
on a 20 to 80 scale. (Note that averaging the two subscores
and multiplying by 10 does not result in the SAT
verbal score.)

Reading comprehension subscores are obtained from
SAT reading passages and sentence completion questions;
vocabulary subscores come from the analogy and
antonym questions. Because reading comprehension and
vocabulary are closely related, the difference between the
reading comprehension subscore and the vocabulary
subscore for a student has low reliability. Only when the
difference between the two is as great as 9 points can one
be certain that there is a genuine difference in the abilities
being measured.

Scores on the TSWE are placed on a 20 to 80 scale;
however, because the TSWE is not intended to distinguish
among students whose command of standard written English
is considerably better than average, the maximum
reported score is 60+.






Raw and Scaled Scores 

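Before scores can be placed on the College Board scale, raw
scores must be obtained. Each correct answer receives one
point, and no points are assigned for omitted questions. A
fraction of a point is subtracted for each incorrect response:
one-fourth of a point for five-choice questions, as in the
example below, and one-third of a point for incorrect
responses to four-choice questions.

44 right - 1/4 (32 wrong) = 36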
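
The worked example above (44 right, 32 wrong, raw score 36) can be reproduced with a short Python sketch. The function name formula_raw_score is hypothetical, and the sketch assumes the usual formula-scoring penalty of 1/(k - 1) of a point per wrong answer on a k-choice question, which gives the one-fourth deduction in the example when k is 5. How fractional raw scores are rounded is not addressed here.

```python
from fractions import Fraction


def formula_raw_score(right, wrong, choices=5):
    """Raw score = right - wrong / (choices - 1); omitted questions add nothing.

    Assumes the standard formula-scoring penalty for k-choice questions.
    """
    if choices < 2:
        raise ValueError("a question needs at least two choices")
    return right - Fraction(wrong, choices - 1)


# The example from the text: 44 right and 32 wrong on five-choice questions.
print(formula_raw_score(44, 32))
```

Exact fractions are used so that thirds and fourths of a point are not distorted by floating-point rounding.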

Raw scores for each new form of a test are placed on the
College Board scale through equating. Table 1
shows the relationship of SAT and TSWE raw scores to
College Board reporting scale (scaled) scores.
Forms of a test vary somewhat in
difficulty from one edition to the next. Therefore a lower
raw score is needed on a more difficult form to get a
given scaled score than is needed to get that scaled score
on an easier form. For example, in Table 1 a raw score of
35 on a more difficult form of the verbal sections of the SAT
will produce a scaled score of 430, whereas on an easier form it
will result in a scaled score of 420.
A scaled score of 200 does not necessarily stand for a
minimum raw score; it is the lowest score reported. Similarly,
a scaled score of 800 does not necessarily stand for a perfect
raw score; it is the highest score reported.



Measurement Characteristics of
ATP Scores

Analyses that provide information about the measurement
characteristics of ATP tests are performed regularly
for each new form. The data obtained from each test analysis
provide information about the test's reliability, the difficulty
and speededness for the group tested, and the
intercorrelation of scores for the test components. Tables 2
to 4 provide some measurement characteristics for three
recent editions of the SAT and the TSWE. Table 5 provides
similar information for the Achievement Tests.

The reliability coefficients shown in Table 5 for the one-hour
Achievement Tests range from 66 for Mathematics
Level I to [illegible] for German.

The precision of any test score is limited because it represents
only a sample of all the possible questions that
could be asked and because people perform at different
levels at different times for reasons unrelated to the characteristics
of the test itself.

The consistency with which the test measures true performance
is expressed as a reliability coefficient. It indicates
the extent to which an individual would achieve the
same score on repetition of a test. A reliability coefficient of
zero indicates no relationship whatsoever between a student's
relative standing within a group on two forms of a
test, whereas a reliability coefficient of 1.00 indicates perfect
reliability: students within a group rank exactly the
same on the two forms.

The reliability data for the SAT scores in Table 2 were
derived from item response theory (IRT) estimates of
standard errors of measurement (see the next section,
Standard Error of Measurement, and Appendix A). The reliability
data for the TSWE and the multiple-choice Achievement
Tests included in Tables 2 and 5 were obtained using
the Kuder-Richardson Formula (20) with the Dressel adaptation
for formula-scored tests. Both reliability estimates are
influenced by differences in examinee performance due to
the sample of questions selected and the degree to which
the questions vary in content. The estimates do not take
into account day-to-day differences in examinee behavior
or differences in administration environment.
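
For readers unfamiliar with the Kuder-Richardson Formula 20, a minimal Python sketch of the plain formula follows. It is an illustration only: it scores each question 1 (right) or 0 (wrong), it omits the Dressel adaptation for formula-scored tests named in the text, and the function name kr20 and the four-student data set are hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """Plain KR-20 for rows of 0/1 item scores (one row per examinee).

    r = k / (k - 1) * (1 - sum(p * q) / variance of total scores)
    The Dressel adaptation for formula-scored tests is NOT applied here.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)  # population variance of total scores
    if total_var == 0:
        raise ValueError("total scores have no variance")
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people  # proportion correct
        pq_sum += p * (1 - p)
    return n_items / (n_items - 1) * (1 - pq_sum / total_var)


# Hypothetical data: four examinees, three questions.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(data))
```

A value near 1.00 means the questions rank examinees consistently; a value near zero means they do not.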

The reliability data for the SAT and TSWE are based on
statistically representative samples of juniors and seniors
taking these tests. The reliability estimates for the verbal
and mathematical sections of the SAT, comprising one
hour of testing time for each component, are typically
[illegible].



Standard Error of Measurement

The most realistic way to allow for the effects of normal
variations in the physical and emotional conditions of the
individual, the test setting, or the test content is to interpret
scores as ranges rather than as points. The same test or a
different version of a test taken on different days would
probably result in a slightly different score each time. If a
student were to repeat the test an infinite number of times,
a number of different scores would probably be obtained,
some higher, some lower, but most would tend to cluster
about an average value. This average would be the "true
score," the score a student would earn if the test could
measure ability with perfect reliability. An index of the extent
to which students' obtained scores differ from their
true scores is called the standard error of measurement
(SEM). The SEM for a given test can vary at different
places on the same scale. For example, the SEM for SAT
verbal scores is approximately 30 points for scores 200 to
670 and about 20 points for scores 680 to 800, with the
average SEM being about 30 points. For SAT
mathematical scores, the values for the SEM are approximately
30 points for scores 200 to 330, 40 points for scores
340 to 530, 30 points for scores 540 to 700, and 20 points
for scores 710 to 800. This results in an average SEM for
the SAT mathematical score of about 35 points. The SEMs
reported for each ATP test are in effect an average of the
SEMs for that test. The SEMs for ATP tests are reported in
Tables 2 and 5.

The SEM for the SAT verbal score is approximately 30
points on the 200 to 800 scale. This means that two-thirds
of the students taking the test will obtain scores within 30
points above or 30 points below (one SEM) their true
score. For example, if a student has a true score of 430,
the chances are about 2 out of 3 that the student will
receive an obtained score between 400 and 460 (430 plus or
minus 30). The score ranges shown on the score reports
illustrate this concept for students, counselors, and
admissions officers.
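
The score-range idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name score_band and the clipping to the 200 to 800 scale are choices made here, not part of the ATP score report itself.

```python
def score_band(score, sem, low=200, high=800):
    """Return the (lower, upper) range one SEM either side of a score.

    The band is clipped to the reporting scale, which is assumed
    here to run from `low` to `high`.
    """
    return max(low, score - sem), min(high, score + sem)


# The verbal example from the text: 430 plus or minus one SEM of 30.
print(score_band(430, 30))
# A score near the top of the scale is clipped at 800.
print(score_band(790, 30))
```

About two out of three obtained scores are expected to fall inside such a band; the remaining third fall outside it, which is why a band is a guide rather than a guarantee.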

The SEM for the SAT mathematical score is approximately
35 points on the 200 to 800 scale. For most students this
will be rounded to plus or minus 40 points on their score
reports. The standard errors of measurement for the SAT
verbal subscores are about 4.4 points on the 20 to 80 scale
for reading and about 4.6 points for vocabulary. The standard
error of measurement for the TSWE is [value illegible in
source]. For the multiple-choice Achievement Tests, the SEM
ranges from a low of about 24 points for the Hebrew and
Spanish Tests to a high of about 36 points for the Literature
Test.

Standard Error of the Difference

Users of test scores are advised against making fine
distinctions between scores. The standard error of the
difference, which is reported in Tables 2 and 5, indicates the
normal variation to be expected between the scores of two
people on the same test, or tests taken at two different
times by the same person, due to measurement error
alone. Score differences of less than 1.5 times the standard
error of the difference have little significance. For example,
the standard error of the difference for the SAT
mathematical scores is about 48. Only when scores differ
by more than 72 points (48 x 1.5) can there be reasonable
confidence that the abilities being measured genuinely
differ.
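
The 1.5-times rule above reduces to a single comparison. The sketch below is a minimal illustration; the function name and the two pairs of scores are hypothetical, while the standard error of the difference (48) and the multiplier (1.5) are the values given in the text.

```python
def differs_meaningfully(score_a, score_b, sed, multiplier=1.5):
    """True when two scores differ by more than multiplier * SED."""
    return abs(score_a - score_b) > multiplier * sed


# SAT mathematical scores: SED of about 48, so the threshold is 72.
print(differs_meaningfully(520, 580, sed=48))  # gap of 60
print(differs_meaningfully(500, 580, sed=48))  # gap of 80
```

A gap of exactly 72 points does not pass, because the text asks for a difference of more than 72.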

Speededness 

Detailed test analyses suggest that ATP tests are relatively
unspeeded for the majority of the students tested and
that most scores would not appreciably improve if more
time were allotted.

A test is considered unspeeded if virtually all of the
students taking it complete three-fourths of the questions
and 80 percent reach the last question. The percentage
completing three-fourths of the test is a more reliable
indicator than the percentage completing the last question
because the last question is often difficult and students
may be omitting rather than not reaching this question.
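
The two-part criterion above can be written as a simple check. This is a sketch under an explicit assumption: the text says "virtually all" without giving a figure, so the 95 percent cutoff used here is a hypothetical stand-in, as is the function name. Only the 80 percent figure comes from the text.

```python
def is_unspeeded(pct_completing_three_fourths, pct_reaching_last,
                 virtually_all=95.0, last_question_min=80.0):
    """Apply the two-part speededness criterion to one test section.

    `virtually_all` is an assumed cutoff (the source gives no number);
    `last_question_min` is the 80 percent figure from the text.
    """
    return (pct_completing_three_fourths >= virtually_all
            and pct_reaching_last >= last_question_min)


print(is_unspeeded(97, 85))  # both conditions met
print(is_unspeeded(97, 70))  # too few students reach the last question
```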

The test sections of three forms of the SAT and one form of
all but three Achievement Tests were slightly speeded for
some students. The percentages completing three-fourths of
the separately timed SAT and TSWE sections ranged from 97
to 100 (see Table 3). For the Achievement Tests the range
was 96 to 100 (see Table 5).

Another indicator of speed included in Tables 3 and 5 is
the average number of questions not reached by each
sample of students. For the separately timed SAT verbal,
SAT mathematical, and TSWE sections, only one to two
questions, on the average, were not reached by the
representative group of juniors and seniors taking the test. The
average number of questions not reached on the Achievement
Tests ranged from less than one for German, Latin,
and Literature examinees to approximately three for the
students taking American History and Social Studies,
Hebrew, and Spanish.

Intercorrelation of Components

The intercorrelation of the different components of the
SAT and TSWE for a typical form is presented in Table 4.
The correlation coefficient between the verbal and
mathematical scores is approximately .66; between the reading
comprehension and vocabulary subscores, about .80. The
level of correlation for verbal and mathematical scores
suggests that there is some overlap in the information
provided by these tests. An even greater degree of overlap is
indicated for the verbal subscores, suggesting that only in
exceptional cases would students obtain very different
subscores. The correlation coefficient between the
multiple-choice and essay components of the English
Composition Test is .45. The degree of the relationship is
limited because there is only one essay question on which
to sample writing ability. But even if there were more essay
questions, the correlation between the two parts would not
be perfect because some unique as well as some common
skills or abilities are being measured. This relationship and
those noted for the SAT and TSWE are typical of other
forms of the ATP tests.
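
One common way to put a number on the "overlap" described above is the squared correlation, often read as the proportion of variance two measures share. The source does not compute this figure; the sketch below is offered only as an aid to interpretation, using the three coefficients quoted in the text. The function name is hypothetical.

```python
def shared_variance(r):
    """Squared correlation: the proportion of variance two scores share."""
    return r * r


coefficients = {
    "verbal vs. mathematical": 0.66,
    "reading vs. vocabulary": 0.80,
    "multiple-choice vs. essay": 0.45,
}

for label, r in coefficients.items():
    print(f"{label}: r = {r:.2f}, shared variance = {shared_variance(r):.2f}")
```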



Repeating Tests

When students take tests more than once, their scores
usually change. This change may be due to the practice
effect, to academic growth, or to other influences.
However, the most powerful influence on this change is the
imprecision inherent in test scores, which, as noted
previously, is indicated by the standard error of the
difference when two scores are compared. Thus score
increases or decreases can be as large as 1.5 to 2 standard
errors of the difference and still not indicate any real
difference in the student's ability.

In the case of the SAT, an appreciable rise is unlikely. On
average, students who took the SAT as juniors in spring
1985 and again as seniors in fall 1985 improved their
verbal scores by about 15 points and their math scores by
about 21 points. Furthermore, for those students whose
scores change when they repeat the test, about 65 percent
have score increases, while about 35 percent have score
decreases. The higher a student's initial scores, the greater
the probability that subsequent scores will be lower. The
lower the initial scores, the more likely the subsequent
ones will be higher. Among students repeating the SAT,
about 1 in 20 gains 100 or more points and about 1 in 100
loses 100 or more points.
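
The pattern in which high initial scores tend to fall and low initial scores tend to rise follows from measurement error alone, and a small simulation can show it. This is a sketch under hypothetical assumptions: true scores are drawn from a normal distribution with mean 430 and standard deviation 100, each sitting adds independent error with an SEM of 30, and no practice effect or academic growth is modeled. None of these modeling choices comes from the source except the SEM of about 30.

```python
import random
from statistics import mean


def simulate_retest(n=20000, true_mean=430, true_sd=100, sem=30, seed=0):
    """Return (first, second) score pairs for n simulated students."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        true_score = rng.gauss(true_mean, true_sd)
        first = true_score + rng.gauss(0, sem)   # first sitting
        second = true_score + rng.gauss(0, sem)  # retest, same true score
        pairs.append((first, second))
    return pairs


def mean_change(pairs, keep):
    """Average retest change among students whose first score passes keep()."""
    return mean(second - first for first, second in pairs if keep(first))


pairs = simulate_retest()
high_group_change = mean_change(pairs, lambda s: s >= 600)
low_group_change = mean_change(pairs, lambda s: s <= 300)
print(f"first score >= 600: average change {high_group_change:+.1f}")
print(f"first score <= 300: average change {low_group_change:+.1f}")
```

Because the simulated true scores never change, any average movement in these two groups is produced entirely by the imprecision of the two measurements.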

Tables 6 and 7 show the percentage of students with
junior-year PSAT/NMSQT scores at various levels who
earned SAT scores at various levels. Table 6 refers to
verbal scores; Table 7 refers to mathematical scores. For
example, the number "3" in the top row of Table 6 means that
among the students with junior-year PSAT/NMSQT verbal
scores of 68 to 72, approximately 3 percent earned
SAT-verbal scores of 550 to 590 in their junior or senior year.
The percentages in each row add up to 100 percent (in
some cases plus or minus 1 percent due to rounding). The
column at the right side of Tables 6 and 7 shows the
average SAT score for the students with PSAT/NMSQT scores
within each specified range.

Tables 6 and 7 show that the students' junior-year or
senior-year SAT scores tend to vary in both directions from
their PSAT/NMSQT scores. Students with PSAT/NMSQT
scores in a 5-point range (corresponding to a 50-point SAT
score range) may earn SAT scores that differ by 200 or
more points. This tendency is slightly greater for
mathematical scores than for verbal scores.

In interpreting Tables 6 and 7, note that the results are
somewhat affected by the bell-shaped distribution of
scores on the PSAT/NMSQT. The farther the PSAT/NMSQT
score is from the middle of the distribution, the fewer
students earn it. Thus, among students
with PSAT/NMSQT scores of 68 to 72 there will be many
more scores of 68 than 72. Similarly, among those with
scores of 28 to 32 there will be many more with 32 than
with 28.

Tables 8 and 9 show the percentages of students with
junior-year SAT scores within various ranges who
subsequently earned senior-year SAT scores at various levels.
Table 8 refers to verbal scores; Table 9 refers to
mathematical scores. The interpretation of these tables is similar
to that of Tables 6 and 7. For example, the number "1" in the
top row of Table 8 means that among the students with
junior-year SAT verbal scores of 680 to 720, approximately
1 percent earned SAT verbal scores of 550 to 590 in their
senior year. The column at the right of each table shows
the average senior-year SAT score for students with
junior-year SAT scores at each specified level.

The spread of senior-year scores for the students in each
junior-year score category is substantial, though smaller
than when the categories are based on PSAT/NMSQT
scores. Students with junior-year scores in the same
50-point interval may have senior-year scores 200 or more
points apart. Again, the statistics in Tables 8 and 9 reflect
effects of the bell-shaped score distribution. The highest
category includes many more scores of 680 than of 720;
the lowest category includes more scores of 320 than of

At a specific high school, the average change in scores
for students who repeat a test may differ substantially from
the average change of the national group simply because
of the sampling error present in small samples.

In the case of an Achievement Test, which is designed to
measure a student's knowledge of a subject, score
increases may result because the student has studied the
subject for another semester or two.



Using ATP Scores 

The following suggestions, which are far from exhaustive,
are intended to stimulate ideas that will yield the
greatest benefit from the reported data. (See Appendix C,
Guidelines on the Uses of College Board Test Scores and
Related Data.)



SAT Scores 

When scores or other data are used for selection (to
accept a student for admission or to permit entry into a
particular course), it is important that they be validated
periodically, most appropriately through a validity study, to
ensure that they predict the expected outcome at a level
acceptable for the institution's particular purpose. Every
three years is a generally accepted standard. A validity
study also provides the relative weight that should be given
to scores and other data in predicting how a particular
student will perform: for example, the weight to be given SAT
scores and the high school record to predict grade point
average (GPA); the weights to be given Achievement Test
scores, SAT scores, and high school record for admissions
purposes; and the level of Achievement Test or TSWE
scores that best predicts acceptable performance in a
particular course.

Some students may be anxious about their SAT scores
because they could not answer all or most of the test
questions correctly. The data in Table 2 (page 18) may reassure
them. For example, on the average, students answer only
about half of the SAT verbal questions correctly, resulting in
a score in the 410 to 440 range.

Some students may be discouraged by what they
consider low scores. Students tend to evaluate their scores in
the belief that the average College Board test score is 500.
A review of Table 10 (page 22) with the student may be
helpful, pointing out that an SAT verbal score, for example,
of 500 is at the 87th percentile of all high school seniors
and that a 370 score approximates the median score. If the
student's concern centers on the difference between his or
her scores and some other student's, the data in Table 2 on
the standard error of the difference can indicate if the
difference is likely to be real or simply the result of the
imprecision of the testing process. For example, two
SAT mathematical scores would have to differ by more
than 72 points (1.5 x the standard error of the difference)
for one to be reasonably confident that the higher score
represents more highly developed mathematical aptitude.

Students sometimes repeat the SAT in the hope of
improving their score, and they wonder which score will be
used. Most admissions officers consider all the scores in a
student's report. However, some admissions officers prefer
to give students credit for their best performances and use
the highest scores. The student who takes the SAT two or
three times will probably receive at least one score higher
than the score of the equally capable student who takes it
only once. When admissions officers use only the highest
scores, the student who can afford to take the test only
once might be at a disadvantage compared with the
student who has taken the test more than once.

Some admissions officers prefer to use a student's most
recent SAT scores. This choice may be less subject to error
of measurement than using the student's highest scores.
The most recent scores may better reflect a student's
current ability.

Other admissions officers calculate an average of all the
student's SAT verbal scores and an average of all the SAT
mathematical scores. This method may be the most
equitable of the three if scores span a short period. It may be
helpful to compare SAT scores with the high school record.
Unusually high SAT scores and weak high school records
may indicate able students who have not applied
themselves in high school. Very low scores and strong high
school records may indicate students who work hard and
achieve through perseverance. Other data, such as
teacher and counselor recommendations, may be needed
to assess accurately students' readiness for a certain
college.
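
The three ways of handling repeated SAT scores discussed above (highest, most recent, and average) can be compared side by side. The sketch below is illustrative only; the function name and the sample scores are hypothetical.

```python
from statistics import mean


def summarize_scores(scores_in_date_order):
    """Summarize repeated scores under the three policies in the text."""
    return {
        "highest": max(scores_in_date_order),
        "most_recent": scores_in_date_order[-1],
        "average": mean(scores_in_date_order),
    }


# Hypothetical verbal scores from three sittings, oldest first.
summary = summarize_scores([480, 520, 500])
for policy, value in summary.items():
    print(f"{policy}: {value}")
```

With a single sitting all three policies return the same number, which is why the choice of policy matters only for students who repeat the test.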

The booklet Taking the SAT contains a sample test,
answers, and a table giving percentages of a random sample
of students correctly answering each question on the May
1983 SAT and the June 1981 TSWE. The booklet also
contains scoring instructions and can be used to gain a clearer
idea of what is tested and of what a score represents.




TSWE Scores 

Studies have been conducted that demonstrate the
effectiveness of using TSWE scores in college placement.

With the cooperation of 15 institutions, freshman
course grades and TSWE scores were obtained for over
4,000 students. A comparison of the grades and scores
showed a substantial relationship, with over 90 percent of
those scoring in the highest score range on the TSWE
having an A or B in freshman English courses.

A second study concentrated on the relationship
between direct measures of writing ability (essays) and
indirect measures (multiple-choice tests) like the TSWE.

Table 14 on page 25 shows the relationship between
TSWE scores and the percentages of students writing
above-average essays before instruction. Of the students
with the highest scores (60+), 85 percent wrote
above-average essays at the beginning of an English writing
course.

These studies and other data obtained in the evaluation
of the test confirm the appropriateness of the TSWE in
terms of difficulty and of discriminating power. The test
meets the purpose for which it was designed, that is, to
identify students who might benefit from additional or
specialized instruction in standard written English. Colleges
should develop institution-specific placement applications
of the TSWE. (See Methods of Implementing College
Placement and Exemption Programs.)



Achievement Test Scores

Achievement Tests are curriculum based, but they are
independent of particular textbooks or methods of
instruction. They are designed to assess outcomes of courses
that students have taken recently. If a student completes a
biology course in the tenth grade but does not take the
Biology Achievement Test until the twelfth grade, this time
lag may place the student at a disadvantage. If a student
takes the Achievement Test in Chemistry before completing
the course of study, the student is also at a disadvantage.
If a student takes the same Achievement Test more than
once, the score for the test taken closest to the completion
of the course would presumably be more indicative of the
highest achievement level attained.

In using an Achievement Test score for placement,
consideration should be given to the number of years of study
in the subject and the level of courses taken. For example,
high school students may have studied a language for
different periods of time. Most students who take an
Achievement Test in a foreign language choose to do so during the
third or fourth year of language study. Others, however,
take the test during their second year of study or even their
fifth. Candidates who take ATP foreign language
examinations are asked to supply certain information about their
training and experience in the language. Normative data
that provide distributions of test scores earned by students
who have taken two, three, and four years of study, useful
for evaluating student performance on the basis of years of
coursework, can be obtained by writing College Board
ATP (see inside front cover).

If the difference between the scores earned by two
different people exceeds 1.5 times the standard error of the
difference for the test (see Table 5 on page 20), one can be
reasonably confident that the higher score reveals greater
ability or achievement as measured by the test. A
difference of fewer than 66 points in the scores of two students
on the Biology Achievement Test, for example, should not
be considered significant.

The comparison of scores earned by two students on
Achievement Tests in different subjects is a more
complicated matter. Although every Achievement Test score is
reported on the same 200 to 800 scale, individual scores
earned on different Achievement Tests are only roughly
comparable. It is best to avoid comparing scores earned
by different students on different Achievement Tests.




Scores of Students for Whom
English Is a Second Language

If English is not the student's first language, exercise
judgment in estimating how much a limited facility in the
language may have affected grades and test scores.
Although no clear-cut pattern holds for students from every
part of the world, students from outside the United States
generally do better on the mathematical questions of the
SAT and on Achievement Tests in mathematics and the
sciences than they do on the verbal questions of the SAT, the
TSWE, and the Achievement Test in English Composition.
Performance on the foreign language Achievement Test
specific to the student's native language may reflect factors
other than classroom achievement. Although ability in that
language is accurately assessed, the Achievement Test
score may not be a significant predictor of overall
academic performance.

The Test of English as a Foreign Language (TOEFL) was
designed to help assess a foreign-born student's grasp of
English. Performance on the TOEFL may serve to help
interpret scores on the TSWE or on the verbal sections of the
SAT. For example, if a student's TOEFL scores are low and
the score on the verbal sections of the SAT is also low, it
may be inferred that performance on the verbal sections of
the SAT was probably affected by the student's
deficiencies in English. For further information about the
relationship between language proficiency as measured by
TOEFL and performance on the SAT verbal sections and
the TSWE, see TOEFL Research Report 3. For information
on TOEFL test dates and center locations, see TOEFL Test
Center Reference List.

Counselors and admissions officers should also be alert
to the problems of students from countries other than the
United States who speak excellent English but who may be
at a disadvantage because of their unfamiliarity with testing
methods in the United States and because ATP tests
naturally reflect a United States cultural background.



Scores of Minority Students 

The College Board makes every attempt to ensure that
test content is as fair as possible to all groups. A sensitivity
review committee has drawn up guidelines for the types of
minority-relevant content to be included in the SAT. This
committee also reviews tests to eliminate questions that
depend on words that may have different meanings for
various groups. Preliminary review helps to eliminate the
inclusion of biased questions into final copies of the test.
Statistical analyses are routinely performed to identify
questions or types of questions on which performance
differs for different groups of students.

Admissions officers have found it a wise policy to ensure
that members of minority groups are not excluded on the
basis of test scores alone. Other factors, such as high
school grades, strong motivation, and maturity of purpose,
can indicate a potential for success.

Validity studies conducted by individual colleges
indicate that the test scores usually are very useful in
predicting freshman grade point averages for minorities. Each
college is encouraged to conduct its own validity study for
minority students.




Scores of Students with Handicaps

The message "Nonstandard Administration" on a score
report and an asterisk next to a test date indicate that the
student took the test in a nonstandard administration.

The purpose of special test arrangements is to attempt to
minimize the impact of an individual's handicap on the test
situation so that students can demonstrate their academic
abilities.

The individual circumstances that require special test
arrangements are so diverse that the College Board is not
able to provide meaningful interpretive data for scores
earned in nonstandard administrations. The total number
of students with handicaps who have taken special
editions of the SAT is small, and the number who attend any
given college is smaller still, so the correlations between
test scores and the first-year averages of these students
have not been established. The norms from standard
administrations published in this Guide may be a useful
reference, but the usual caution that test scores should be
considered only one factor in the assessment of a student's
academic potential is especially applicable when the
scores of students with handicaps are being interpreted.
(See ATP Services for Handicapped Students: Information
for Counselors and Admissions Officers and Information for
Students with Special Needs.)



Scores of Adult Students 

Scores from tests taken more than five years ago are
probably not good indicators of a person's current ability to
do college work. A person whose work requires verbal or
mathematical abilities, or who reads widely and studies
independently, may be better qualified for college work now
than at high school graduation. After five years, it is
recommended that students take the tests again rather than rely
on the old scores.

Admissions officers may find that ATP score reports
received for adult applicants are best evaluated in terms of
most recent experiences. Motivation and the
contributions of life experience are important considerations
in predicting college performance for adult students.

The mean scores for 31,056 adults (ages 20 and over)
who took the SAT during the 1981-82 academic year are
shown in Table 15. On the average, adults earned lower
mathematical scores than did college-bound seniors.
However, adults from 25 to 39 years of age earned higher
average verbal scores than college-bound seniors. As with
college-bound seniors, adult men scored higher on the
mathematical portion of the test than did adult women.




Percentiles 

Percentile ranks give additional information about a
student's test scores by showing what they mean in comparison
with the scores of other students. For an explanation of
percentile ranks and of the reference groups, see pages 8
and 9. The percentile ranks for different reference groups,
for instance men and women, should not be compared.
However, the scaled scores for the different groups on the
same test can be compared.



Using Predictive 
Validity Studies 



Predictive validity indicates a test's effectiveness in
predicting academic performance. This, fundamentally, is the
purpose of ATP tests: to serve as predictors of academic
performance in college. Such information has proved
helpful to admissions officers, who can use test scores
together with academic aspects of the secondary school
record to predict a student's chances of academic
success in college.

If students who do poorly on the test do poorly in college
and those who do well on the test do well in college, the
test is said to have a high predictive validity. Thus the
statistical index of validity reflects the degree of association
between the predictors (test scores and high school
record) and the criterion (grades in a particular college
course, overall freshman average, or a four-year average).
The degree of association is expressed as a correlation or
validity coefficient whose values range theoretically
from -1.00 to +1.00.
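
A validity coefficient of the kind described above is an ordinary Pearson correlation between a predictor and a criterion. The sketch below computes one from first principles. The SAT scores and freshman grade point averages are hypothetical values invented for illustration, not data from any validity study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(variation_x * variation_y)


# Hypothetical predictor (SAT score) and criterion (freshman GPA).
sat_scores = [400, 450, 500, 550, 600]
freshman_gpa = [2.1, 2.8, 2.6, 3.2, 3.4]

validity = pearson_r(sat_scores, freshman_gpa)
print(f"validity coefficient: {validity:.2f}")
```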

Table 16 presents a summary of all validity studies
conducted through the Validity Study Service that were
designed to predict freshman grade point average, using
SAT scores and high school record (rank or average). All
studies of the whole freshman class, of male freshmen, of
female freshmen, and of freshmen entering any of six
selected college curriculums are included. (If a college
conducted more than one study of any type, only the most
recent one is included.)

For the 685 colleges that studied their whole freshman
class, the 90th percentile, median, and 10th percentile
validity coefficients were almost the same for the SAT verbal
and SAT mathematical scores. That is, the SAT verbal
correlation was above .52 for 10 percent of the colleges,
between .36 and .52 for 40 percent, between .21 and .36 for
40 percent, and below .21 for 10 percent. The SAT
mathematical correlation was above .50 for 10 percent of the
colleges, between .35 and .50 for 40 percent, between .20
and .35 for 40 percent, and below .20 for 10 percent.

The validity of high school record is typically
higher than the validity of the optimally weighted
combination of SAT scores. For example, for all freshmen, the
median correlations for high school record and for the
optimally weighted combination of SAT scores were .48
and .42, respectively. For males, they were .45 and .39.
For females, they were .50 and .46.






The validity of the optimally weighted combination of
high school record and SAT scores is usually higher than
that of either high school record or SAT scores
separately. For example, for all freshmen, using the combination
of high school record and SAT scores raised the median
correlation .13 over SAT scores and .08 over high school
record. (Because the median correlations in Table 16 are
rounded, the difference appears to be .07.) Although such
improvements may seem small, they represent an
appreciable increase in the accuracy of academic prediction.
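
The gain from combining two predictors can be illustrated with the standard formula for the multiple correlation of a criterion with two predictors. This is a sketch, not a reproduction of the validity studies: the median validities of .48 (high school record) and .42 (SAT scores) come from the text, but the correlation of .35 between the two predictors is a hypothetical value chosen for illustration, since the source does not report it.

```python
from math import sqrt


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared)


high_school_record = 0.48   # median validity from the text
sat_combination = 0.42      # median validity from the text
between_predictors = 0.35   # hypothetical; not given in the source

combined = multiple_r(high_school_record, sat_combination, between_predictors)
print(f"combined validity: {combined:.2f}")
print(f"gain over high school record: {combined - high_school_record:.2f}")
```

The more strongly the two predictors correlate with each other, the less the second one adds, which is one reason the combined figure is well below the sum of the two separate validities.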

Still greater accuracy in predicting college performance
can often be obtained by using an applicant's Achievement
Test scores, either individually or averaged, in addition
to high school record and SAT scores. The average
increase in correlation is between .02 and .03, bringing the
total ATP test score increase to .10 to .11 over high school
record.

The ranges of high school grades and test scores of
those who enroll in any given college are smaller than
those of all students going to college. Applicants with low
grades or scores are often not accepted. In addition,
students with either unusually high or low grades or scores for
a given college select themselves out by not applying. As
the variability among enrolling freshmen on grades and
scores is reduced from that which would have been
observed if the college and the students had not used grades
and scores to decide who will enroll at the college, the
correlation coefficients are reduced. In the extreme case, if all
students at a college have the same grades or scores, no
prediction is possible. Because there is even less variability
within a college curriculum than for an entire freshman
class, validity coefficients by curriculum are even more
restricted.
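
The shrinking of correlation coefficients under a narrowed range of scores can be demonstrated with a small simulation. This is a sketch under hypothetical assumptions: predictor and criterion are drawn as standardized normal variables with a population correlation of .50, and "enrolled" students are those whose predictor falls between 0 and 1 standard deviation above the mean. None of these values comes from the source.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(variation_x * variation_y)


def simulate_applicants(n=20000, rho=0.5, seed=1):
    """Draw (predictor, criterion) pairs with population correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        predictor = rng.gauss(0, 1)
        criterion = rho * predictor + sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        pairs.append((predictor, criterion))
    return pairs


applicants = simulate_applicants()
full_r = pearson_r([p for p, _ in applicants], [c for _, c in applicants])

# Keep only a narrow band of predictor scores, as one college might enroll.
enrolled = [(p, c) for p, c in applicants if 0.0 <= p <= 1.0]
restricted_r = pearson_r([p for p, _ in enrolled], [c for _, c in enrolled])

print(f"all applicants:  r = {full_r:.2f}")
print(f"enrolled only:   r = {restricted_r:.2f}")
```

The underlying relationship is identical in both groups; only the spread of predictor scores differs, and that alone lowers the observed coefficient.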

However, whenever the number of students is
sufficiently large, colleges should do separate studies by
college curriculum (especially because courses may differ
and grades may not be comparable), sex, ethnic group,
and other relevant subgroups of students. Separate
studies would show whether use of grades and test scores is
fair for each subgroup and whether use of separate
prediction equations might promote greater equity.

The use of statistical predictions based on the high
school record and whatever test scores are available gives
what is theoretically the best possible indication of the
student's college grade record that can be made from these
data. With this information available for a given applicant,
the admissions officer is free to devote more time to
considering the school's recommendations and other
information about the applicant's personal qualifications.

For additional information, see Chapter 8 of The College
Board Technical Handbook for the Scholastic Aptitude Test
and Achievement Tests.






Appendix A. Tables and 
Their Sources of Data 



Oju •^ i*^ atn^ Kvn «.» tor nimww^ « 

MNo^i;^' *^ 

♦*«0»* HJ ' *l»lftii. IMO 

*. •<'«tO'r#«or« « »T cotnpontm m r« TSWt to ■ tyt>c« 

^ t > w M««i to f 

^*M'^ in(>^l^ tMdr MOrw •» OCtoft «i «MnW to I M« « 

Ow*^ two 0« 891 000 |**J^ 

*^ " *^ • 

0«w bjj«3 c« *6w »i (CO Bufir* -hp i« 



« KXA r« SAT r r« i(lrrg <« 



N (to tgtru* tKXn] Mncrt M to 



r rw 0^ KT^ graduiwxg otaH M )W M Mr on* M <«$AT 

r«KWOttor«SATcwwMi.»Mto.y«10n*Tr«ow«<«»« PwcMVrwM 
n 3 !♦ »»W •»« *e* • tow* «»»w»»Ort d ?• PSAIIMUSQI r Owtw 

s^r^!^**^** ****** 



» TWrt Kow (to ooligatDkn) 
W'Kiwww^nMftirtrr 000 h^f. wool •udr«*f«> 

MIt It. KwoarM A(fuMm»r« 1m (to H»i OoIto»taM«) 

^ «»W by NOT WXltUCto* •»« 

n^tound riM Sw r« Tiuit 10 9^ to ■ dMCrvKn <« r« 

to Wl 001^ t<k«l ,(K,„a » t»w *^ b. 

tcVM «MinK> « Pw lm« ?« VMrs to* r« Mh«A «CM^g 
***** ^ ***** * tWoWMrfMua, l«0<«r«to««WM««*«f 

ICMM A ■ vnri IMm VcM* TV* umftt if «udM f«Wr b« arWrMctfy 

-< »«m »!• <i*nw>(to rwjrt ^ f, e«op«no ^^ 



M to MM MM r *M 1 



0« Vt tMM OA (tout )7S 000 MUOM M^O K> 



DW to f*« « bMM on «toiJ«»n«»» 31 «« 20 <f w 01 
Mlv«^>iookr«SAibM«MnOcttbv«««««iJur>««M2 

••'•Ml^W^wi**^ coi«««^ uwng SAT nvM »Vi tffod (wewj 

rti» <W 0« r«» Wfc* 6««J ooiWM r f» AfF V»d»y SMV 
S»v«rto CCtapM aincU»^ «^ .«l«ng 

#1, d M »K« (WfCJur, *^ «a«>M (» • cotegt 0<>% 

XMg** Oor«tMino CUCai^t^t* <Kim»rm 1 1 «« ffw «wt tFOie d 



122 



Appendix B. Testing Terms 

CimlMlin: r>« wxMrn ♦» m«o *rmM* or lu* •§ ht-^ trd 

tnd tr*^ pao(M 1^ on or^ vnr ju* (Mt ID bt on r« 

CoriMMn dOM net •*C'r CauM bur 0^ HKiC>««n 

C«wil<iM lllHIiliiit r«om>c<ffW) ro»»>of »»{ifw««vj ro d»gr*«o> r«>«»ofv 

C«s »9« - 1 00 «»*<rg parlta n*oifva OQwUMory nf<x<r 
C«>ngxecO«T«l«»on.loalOO »g^»«^Qp^«i>^^^lCl>*»< C O^tU^^ ) n » tom*n0« 
r« oortMoo oraffoM tM*>t*n r<0^ amj tcx • orkv o» •>•• 
1 1 00, tooMv « tnfft coJa (ttaa h« w^wui trrw But, < 

Meat n>rr«i*i<on cocfV^nti at Ml loorn tncJ '^^^^ikm Of acaOtrx tucctM m 
•Or««r«r« b«N>««n fa^o and 1 1 00 KnooMovottAMtvcUtiarveriO^ 
tnablM en« 10 pt«M h« 0< ^t' «uniV«0 on r« ot^V v«4M ^nciattK^y 
bJI grMMr JOCtf Ky I CC^KifCn «nrt nro Th» N<f>r r» COf*- 

r» iMl rn I rr iT*r "Ti '1 r I I i tt-t- — n ~"r '^"^ "-' — 
el a M on t oy^'xxxj km wx] oy^cwnatM •or v»«kof« r <}Ack#y an^ong 

an CM trd • O*** « '*y»<J*« <yi r* mi m««>« Sor^rrm a 
r vidom r>a« eanctaa m tat«t amt old tym of r« Ml o< a rapr» 

tarukv* %irxtt H rnvriri fror^ an 0*0 tar<rt m*r ba addad 10 a naw mrm 
ctt*»»fig ra pa'igmanca o» paO yd o^ffan cand 3 ai n on r» old mawr j) ra 
dAaranca ria<» (Miry itvaH can b* cciwnivd ard ra main and undird 4av* 
»onc(•am''^a4aon^ar«*lorm«an^*a(>4M•4l»rt•actr<««l•>anf* Thut, 
A a tin(M cvia'auy^ a io^*Ua * dar^ rat rr'^ «c<»at mada on rw naw 
10 ba t<acaa on ra leaia <«aa r* a^y iv ScaArv; ard caiCyaM r« Kcrn en r« 
naw ity^ to t«y va OcrvwiM 10 tXM ok) io>n«, ragtrdtw 91 d«vraiva« 
A DAom, NcauM luCMwva ter«^* e> r« m> ara «y«ruc«ai) kroAtaog* 
<t(Xii ra iWx^Y tavM •o' •♦Ncr> fqi>ri^ fv* cc«xfc-«t» r>a •OM'!'''*^ •« 
notvarngraai 

f ii»<awcT >ljw»y«aa. a mtoaton « toprw njw h^y> » Oooio 
ir<0«ing ra oy*** e» 'xSOu** who o«*" aacn >cwf or w^om iowai W 
aachKcartMrvai r^aquancv a«»Ciut«n«a'a uwd to oa)«rnv«utM 

Mrankt 

MaM af ai W iiiM « i t w aan. r« awaga 

IMaN:r<aicorabatcw«<i^V>c>a«^etr<acaMtMa«ooraMCu»^ * 
ra a«ff«</>on el looat •chwm by V« p>M«nc« ota la* ab«»art caae* c' Mna 
»»«»<\*va rai-w -'maybaab*««'tumi«wyOaKr(itcnetraVo«X>r<anria 
n«an yr<aO«r4 «tv<«vi>«^c rai'^MMnandmaaniaJbawnonliMrihCJl 
tha a«o jy <3a«i»v«in ra iC»» pe*c*n,*» 

Nanw. a iUHWai o*ia<itcn e> r<a tariotrnarca on a Mr et a M* <)ii^ 
raityvMasara'aranca *r » ^tuQ* nr'icr'»aric« o( or* rt>vOu*i 
rai«>« Moal ny>ni tablet yv>* 'n (jMcancyq ord*' •«r«ui WW wc*M 
*\3 r<a vtitjtrtMjt di t*x>* ^ ra '«'«'»^ yi-JC *^ tccart taio* , »ctyi 

lavai Tna k't>i»v>g <\>v <x.* 4 ♦ or«C*ni]ucMy(MM"*iin«hMi^Or Via 

eo«notra« «*< ra lataftrica qtcmd 



Obtained score: the score actually earned by a person who took a test. It is
considered to be the sum of the individual's true score plus the error produced by
the imperfections of the test. The error can be either positive or negative, so that an
obtained score can be higher or lower than the true score. (See true score.)

fa««<aM ««Mc r« paraaniet leeraa Ma AMuicn rai ara kM« tw^ a MTVU' 
•ar o(ttf>ad Mora T»afa«»>awiQaooraaaraair» »aria i a»* o« ^tf^ 

Am aaafac ra n^yvv^ oi oor'aci faaconiaa ^m^a a fracMn oi ra ^cor^act 'a 
iCo»M< TharaaKoraaoonnanadioaactfadiooraiorraportng 

HaMMr *^ V • M maaatm c onaawrtif rat «. r« a>iint 10 
«<«vy I par»or> ffpaaanq ra MIor ia»/^ in H km^M *>oJ<t 10 gm ra 
tartaioora aw/«iro M pracae* MaMi no ^aawea MabMy • uaurty at 
PiM J i»acpii'i> u iccii»cia'«and«co>wati>.'»»*iiior«a c e*'< *a ntf 
a Ml a4h a partac9y paraM tarm e> *<a tar« Bit 

Tha «y«*C« et 'atubity o«i ba *«a«M at loaoM H«grw MO yardatiAt. ena 
mada et wood and ra orar et a MP* M • atiiiaa V waiM^a ObvvuMy ra 
woodin yvdaic* •« ba inora rakatia bacauw I w* 0M« iw* oonMMPi * 
an ottaci • rnatftfad M<f« « rapaaiacry Soma anor • «««duoad m at maa»n 
war«>, whariar Mil or wac^w\a»eowaaa^»^^w>n^^ by rar<partaq'*ifci« | tf 
ra inaaak^*^ *>*\«ftar4 

naacalna: na«gn^ ra tywm tor viralorming raai aooraa « (aportad toow 
lor a itai o> wino (ro^^am Atfv««mar« IM «<«ta raacaitd tor ra loaowra 
yavi 1«S»rrow^|»7? 1»» i»r« v«d 1*r«rrOu«MMO 

•aalkif: a maaiia e> dafirw\0 a ayaiam lor vanatarinng *a« aooraa to i aoaiad 
acorat lor a Ml o< aM»ng c^*^ 

t iawOi «ai»<a«> i! « maia>w el ra tctad or a«iart ef vanabMy el a a«i ef 
aoorat arn/td ra* irwwv n« Mndard daMton rallactt ra dagraa e( rwrwotna 
«yetraOFO««a4hpaaMoiioravaraM«>4uMMn Thai*. raMaradapar^ 
son e> tcoraa. ra wnaHar <*« ba ra nandM d*<naaon 

mmtm4 mnt al »a <W»ii>aa . an rdoaon el ra artrt 10 ra d^i^ 
anoa baiwaan ra Koraa el MO («ocM enra HI** MIor ra aooraael ena paraon 
enhM)(Wiar«niM(imayfacrataniarwdwaiora«mMb«yeiraiaal Tha«a» 
^ Ca r«aacir>aUy cty«(iani ral fia Nsr* aova 'acraaami graaMr abMy or 
acr>a»arwan<aiwaiat»adbyraMl«rad*iranoaba>»»aafti«»o»ooraaa»caadl 
1 9 imat ra aundard arw el ra dMtranoa lor ra Mt 

•uniBftf anw a( awMHraManl: an oda< el ra a«ttni to «>n^ tMina ofr 
laoad Kor at diNf kom ra* rua aooraa Oaipiaaa J k aooraunwtf raiaal 
iraarr* aitandino ona tuvidard abova and baKi* ra rvN aoora <Hl nluda 
U par»nt el canMHac euanad aooraa Ufdat^ <mr,^ H a«wor>o 
ard ar*cn abova and bakM ra rua aoora •« nduda M paroait el ra eamXMH 
otKKW) aooraa 

"ftwi taara: • ^yt<ora«cjl owvw nocabnQ wtNal ai-> loor* on t M 

w(Mdba«rara«aranoar*v**educadbyraK'«Mtmoprooaaa Ksroocrtel 
ai ra hypoawtcil awarao* el an irfinaa fWftar el oUarad aooraa a«> ra a*ad el 

cacKarafnowad 

>M«ir an ^idcaton el ra aittrl to MiMyi a Mf or oVw maaakM doaa ra |06 tor 
•^cn I *aa famdad Thara ara aavwtf knda el v*4tv ^raacM vaUiy « ra 
aiiMriioa^Mii»>aa.»v«<aianc« vaaMtocradciaoraarcnvanattaiutfi 
a* yadaf o> (a(x#y lav^ VMOty « »>craatad at a oorraMion eo^ioani baiwaan 
rafradoor var-abia au» a» ra SAT tooraa and ra oaanon vanatta ViKMy 
co*«««r*i. Ilka il oorraiaMft ooaii«ara. ara haav«y (A^ncad Dy raanriio 
wr>>cf< ra 'v>v<dualt «ij(>ad ara loraad out on ra pradcK* maaat/a and on ra 
criwr«n maaM/a irarangaelAT^aooraaaraatfvnarvyaaigradaia'aarctad, 
ra oofaiaMin b amaan ra M «4 ba irnaiar vi VKM ra rangt el Ktfai (or 
WnwO audar« • HmcM thi*ri r*n ral lor ra •PC*car< 7*j0 

Tr«ra*ora a v*ct> ooa*oaf< baaa d on aowaad aiudana a* i/oaraHir>aia ra 
utaU^aai el ra »M aooraa at an AO *> tHaong wTtong apcKantt 

Vaf<abiatorarr Miacoraa. ncAxVigtipac«>« rah^achocf avaraoa alto 
nava pradova >«4>v Tha NgiMi »ad<c«va vaNMy « oanarafy Obiamad by 
ommgMiaco'aaandn^tcftooiciui'antaroradaavariQa »ia>ghM m aoooro 
anca ra tat<.<i( el a fxAc^a m^^etusm anaiyt* 






Appendix C. Guidelines on
the Uses of College Board
Test Scores and Related Data

T-* Ct«*9e ScLTvJ w vtotrfS and <tartuM .of, , 
C/^fr^crs on (A« c* CoM^p aowt^ b«r Seem *«i A 

^1 r-v fv.,* Th^ g>iOtfm «» r* aytM»« r* Sum 

'^»y* « ii^^cc' r» *:» r* ifv««i «5« «, »«v e4* iftirw t»». 



»cn» B.M Mr.* -a tOTOl »^-v»«^->»^a 

College Board should

Ky*^ Kyxv(r^ rcKuhons an,) agcrces abeui mmt « tnti •mim wr 

trrvxwy, and cap<UM Ma t» ^fpm n W 

C«* * P*» ,M ji^yi ^v<«t 

Assure .m:„<vj, o» ft »m» and f«lJMd«r»«»ty m*-«rtno kw 
-*«»^a^».,U/.,«; ana ^ar* tor* f„„«^ MWw>3 ^ 

» un»,si*v»oQ and UB., «fv«« TO)^ 
a>3 fw^srfyvj a F vs., nMA<>» a,,),*^ 

* «!anc^ <^ kno*« 5, ,«cr W ^ »^ ..a,,*^ „ „ 



*»v^» rt a«ror''«w«st anj 'a#nesi« a « 
'atxTy a.CH:<<^ wcorysnrya^pc 

cc«?>«oo.^ a-O r*v*» ^ r* w» and nrcmy, V/v«vi <* C^oAj* 

* VtluCarcnA t^t/nyn 

v*y <K<« e« y* P«ua^*> tirc^^ 

' ^xurcciiMtfyaruaaciOMCv 

Ma.(v>*n «<>gK» pwMXrft lor [.OMCM^ ty^xv*' ni>«t* W W>m 
f«<*wng ,n.O''»»*'<ys -^ai » derwy Orty ^ t». aXWrt 

grc^ c< '^.M^^ or aa*^ Wwr»«o o»t<«i^ 



M»n!*.n «<ya v* C)^<Oijr«f ttirfyrq r« « 
•on aco^acy anj in» r«<<ov)ng irT c«c » cand<l»qu»«.itf a 



When College Board tests are used for counseling
purposes, counselors should

M» rrr laai) V laM n pkxvung rw« «dt<.««yv« 

rw» tan r« oj-WwCKiitr rarp* r« f> fw twn brt rtff*^ 
l«Oia«t»W».**<y,a»«r^«.»»MC)<y«W<*»« ra(WiMw«(ne*. 



Schools, colleges, and scholarship agencies that use
College Board test scores and other related
information should

•<»nal ™*i,^,fT»n, ^K,»n9 cw«r» Mer«#* fa9ar<lno ortm Ccrw 



i*y"»Mar«» r« a(>nrjs<ir» » a^ 

itf>o<* rear* •f*<yi art awfc^r yarj^no gr** 



When institutions use College Board tests and related
data in conjunction with other information for
recruiting purposes, as in the case of the Student
Search Service, they should

up«Sr«tKMT*air«r*y»«onwg r« Coap^p Soann Sum SM<7t Sir 

Vhe r* ««9r<>MfK)n ar«r rw fCvMng (upme. oonMwi asa# 
arm 7««n n as CaniMUM W W Cotesp 

c»oi<w»wt aoetearrt T«wart «*i-»*un atoj 

not»«nv*onr«(t A«Mnn »na jw^*^ r»t (Wrtun*** » pflw^ »rt*v 

c* au«wv>» am) »s* trtxv c»«» &, ^ rw 

O^fcam f«ljr«| t> Kvoai acad»^ (vov*** 

fMMvt •Jmncoi crocMuwt M (toKl»w> e r«v iMW tt Co* 



When colleges use College Board tests for selection
purposes, the responsible officials or committee
members should



"oyam at Mvieoie^ » He toocmuvy xrod iwxwj am) c«« rtcrnahin 
*to^« are*«^ a»wM^x3 aD*v » .^iwiai* ooiiv 
-vrig rut a ocvcn^on t>«octar« • «nM ataoyt (kOo tun a 
D«d(w ^ 



I»« JDcrOfyoM tnoMwa-on oaiXVirs c< t^ypm^v^ to wtHcyt 

kXjgr{u»~^ ana wyhen mvk yoM* Won «jdar«L. o/'oiar 

ao^ - p (»,««i(iino •t>ji|j(>» (Idem p ac^^ 

V*. »>i«Mn« M MCW « curw a,o ararwrw nlcjMt f>*» a% 

i-M am) fiiao r-iiM^m o> a 1 to a^*9a 

Ua»^ am^ fronds fc» c<^Mrg r« ortocr*^ <f mi »»j 



f »o*->p ■nose *^o -u, ^. » txx*wn » »», »j rtc^^on at^ 
'clvA>'^}*',a^ ar»'<.r>j.tOoi#^»taWaard»W*r»rlur.>aonT*» 

v«i4*atwuutd ^ 






When systems or groups of colleges use College Board
tests for selection (admissions) purposes, the officials
responsible for the group or system should

Col*a coo**** r*C*0* k»<^.^su>i y*<»N Iw •*ti «X>vOu* 'Ai*«n-o 

•weed arid oTw »*ywi»«)n (COJl KV'CvVtj n »vv>»i^ r** •T**' v ^AJef 
Conduct «rCOC>nm VW« K tnv#« fir Vjrojrdl c*i 4^1 *\3 

IM #(9 4MKcr<M* OirisOcrvm Evcooon* « r»^"%*x* Vy tcvtir* tul> 
omOjcl «cro(ir«tr *y«e<n or grixp r«9tJi«V (• g v»«r, rn>* y»>i^ f 

ocitgm «fo* uTc*'* kv« md cctcru/vfy v» rcc««nu»vrt oi --OvOj* 
rcvMo^ K oonKJc »>0 i>v:)j*s posjfct* poK:** M is va»i*s< ^^'\y »♦ cv* 

W>o rtroJuCifxj O r«v»yrig »»»./sv<t<^ cxy^*-* vy<«<-< le*3 V»v CO 

vOFC)or<«<W'<Mno«<«io«:hc<o*sandujdr'ys *>r^ur\**r* nr^vci<^ 

Avoiding the Misuse of Test Scores 

»* USMI iCf»«»*#y df<*ri»*tv an rTnC* and ircfincX f-andconl 
S<ot» r» 0»C«<*>s and fMy*^ »s! vt»« •^i^tjr m*» ru«« vy^fjrt 

P«Kt<«4 ''VOfy'VJ r« u«r C ♦«»« T<Jf m^jr* Jijf^ !PM» VO" *rh V 
fjOgr^ffi «-vl -yy w<» rio «^y<^ e» in»x» iocs <>^«r» WNt-i («s A-r 



fP«lk^l^«t*»*4»»**<•»♦*^»•l/vv*»vl(JrC»0vJ*fVc^^4^ 'wjvcj"!* 
COi^lM Oft » e»OW»f * 1Cir» iixv/ »»• r«w. * >»<,s Tf^ 

tugfc* The wcfwc^ stun nrt*d jr* uf^r-^*"*^ W(<» 10 ce c< 

*n* MK/V^ IVM Mi'VA/iXC tv*. ^eicAi ATvc* itO)n<<r» u^r<^ ^<on 

O*no« t* *^ 1WI on IXise *W K gi>*<J v»«C*C^» *>>"V1 <fs <c 
W»» tVPOW^ f^v *^ *«K«*<3 to nor C*f<jt*» i*'v>og 

3 tf^ KC»W « r» «c« r'CC'l*^ <VCisonl I'ta^'fx) T« |.yr< 

* S*t o> o^yv CoiVgr fV.Mfa mi vc»« orrt rwf t^^oi on 

<iy nrt us«d « ill Of .fn^o m « rvi^hjt^ p.w« 

wiong the College Board 

*CCOC Co**}^ Poar J Rrrj^jii* C^*' t>.»C« Cf^" ' » * lis' •><> 






Test Date Formula 



flMM m ih* cwf«r« ATF /tagfiirMton flUMh lor dMM and r*^^ 

StM MHon lor M *M in tfM MM ) TM IMI d«M «• MrmtnM by M tomi^ 




College Board Regional Offices 




l*StalM.Swt* no 3440M«rtMStfM< PbMtifiha M 19104-3301 (215)307 7600 
t Sum 60S SCO D*vi» StrMt Cv«rt»»o, R. 60201 4697 (31?) «6-1 W 
MnrEnQland 470 cMnPonjRow) Wkttham MA 02154-1982 («in89O9iS0 
South. Sut* 200 17CitCul<vtP«n(0fiy«.NC.An«nU GA 30)29 (404) &36-94«5 
Sounmwt SuiMt2?,2n EMtS«v*fl(AS]r««1.Au«tin 7X70701 (Sl2>472-0?31 
WnC Suit* 4M. 2099 G4ttw*yP1«C* SanJOM CA»$nO (40t)2S»4K)0 
St«t*70$.4l&5C««tJ«w*aAv«nu« D*frv«f COS0222 (303) 7S»-lS00 

In Puerto Rico, inquiries should be directed to

ThACoKtOt Board Banco PopuUr Svt«70i.Ha»n*v Pu*noRcoQO9t0 (909) 7S»^^ 
Utiitig Aaortss C«l8ox71i0i Sanjuan Pu«n9ftico0093$-1ini 

In Alaska and Hawaii, requests should be directed to the Western office at the California address






Bias in Testing 

NANCY S. COLE University of Pittsburgh



IW «t 1Mb iilfct ciittlcidM «f • ygji adMol 



CTfT i j/tTuiiV/ ^ T rr TT^T T ^ oMpmI Hter •fciefc *• tout rf bta y 

fiA* * > nH iii»iiii < i I n i>*i|Owwfc>*i d^ij^wb^ddiJ. 

jugj/JLuLumji lUJJtt wwdLiMw Iw artUt Mb I* prarlw M ov«r*k« of tbt 

o dto m iiiif M^ wmm W far* hi^ufciw nuiiiwifcilMrf wU rwubb^ bacn 



llowltoNMrciiCMaadcMMalaaiwtr For«' 

•ffwtt M IW uat eMihnKtlQii aad dm piktrtof 
So— rfA«»a<|ii»lii m iwiiiMii L l H iilA pncwhra «£h MMT Mfor. vtUy oad 
l«tbctarwM(yMsW*tbr»«K«d(|«atkMHrf Mb («pMidlr Am wftli hifh volow tad (rr- 
Ml biM. TW pMMy if Um ii M» Wm bM ^vM mWoa «Mid« Mb « WudMd 
• M)wl«wafMiaWc».lfcacMrti.Mldml- a cfc lii i M MW i— J nn » f fft Hii iMl n i it« >n)iad 
optn.«idKUinaf MtfaftttclWIbcMibM pwhihlyb n iiiJli H i wtoiprwMMbintet 

<Mliy. H»w««v. thi «ofk oMot bt Hid to hivt 
H^MMoy MnBM iM pwnc coMownM or 
I if tadsMl l» km iMhW mbv Imm Moctotod bUs 



• if I 

IW IM* af M bM hM ««Md lap«MM « 

■ af h bii a— all^ m rt fiin Mm ■ Hh ■ brb ^ . . ..... 

iiba.baDaMMirtiriMdTweaM«fteM. At V«A(toy B«ib r«Ai»ioi< 

MflHi liiiiKiiii <i|iil(rMal)itdyalad SluApproudm 



vMsrrr AMI ni rans 



Oaa iMM Ito la May M*H li » WMI* w 

kaatiiMtillMlM*a«pMw«'k<*ti#«<d^ ^ ^ , . . 

' ' . "ifiiilTifi h I Itfgi tffif II tan IWfaidtailaMlfwtbtMbniealKbakraf t«l* 
1U mTvIM a* 



pdWaMMMtaMi 



lag bai bMi tha vaUity tnrt ta tbt defimtivt 
«f Ml wUdatM. CranbMb (1971) 

NWnMlKfcaf 



•"^••J Aa faaAlaiMj piAcy MpMnHoat Mm itt Mwracy i lytnlt frtdlctM m iaf«wc» 
ofdtflmalfMaiaMbalaMMaftWvalalda MdclMstHla 



«»ldi*bkblbaiMarfMlbMbaabae(wa MaMiMii iliiii <l iht liwpmitM « 

1 Mil wa nM l j wiMd. Othw la iW a Mm 

|Mt«ilMloraMMOMrklutyiMiir.T«». SnuZTITt Jhj^I^yt^^ 

opyr, Ibk Mw^ tha apfi upi laUai ■ ol pMtkwkr Qi^ ini . n Md>w » . f ^ | i — mm 

Vol. 36, No. 10, 1067-1077 AMERICAN PSYCHOLOGIST • OCTOBER 1981 • 1067

s g^ fe s&ft as" ^ 







ItfsMfMllypt " f i ■ m wi mmy y Al r w i u iim whuJ nbto. 

rf— whliij liirftlMiMw»WlM»«it ^**Mfartfctacfiy«MiW««ilii«eM(c«- 




<i«nt*iMiiiifoftWni«M{iif(orhM)af«tcilKer^ diM «rr«iy WentiW 




MM iiliy iw ii | ii H » lapi m i l fcwt mU M i m hr piip iM Mr to ii w liii i < 
MaM MMd(AMMl.lMllMli•O^M 



I *i<l*n fW<i M <> 

MMri*«*t«MMli*t«l 




*iM*t9*rktta|iMMiiir*Nii**t -f— ■ — ti tT ifriiiiiii nii< i 




, _ , MMlMMMStadiMSSlZ 







abml hinmrn. Colt (1173) prafow • 




JtolW QiMt^ IMO^ Um. If73( rMtiM « Ntvkfc. 

• im SdM* t H«iir. 1IT«. TWt Mm 

...... . - tp iw i fc ti tfct II dttr ihii ptiUm bkmrn 

tiM tkMt *Wl W 4«M !■ Mlnttia fa f^ -*"^ ' r'-VilTrt rfawj |»^Miu 



. «iilltM tf ii luM ti fahMn « • SKkJ poUev 
rBKka^M(MrdM*W«HttetiMpraom (MMkk'sstcsarfqacsit^ruitiw.&'XM.. 
nttopii iM i n ilii H ili lH fcfc<iM«ctfaiW' dtloafat u fctmw M ttw UH i i wfalfif CMlttd 
tm «ri affMprfMt ttltcKM ftfedtt I* AOtraM itcMal eadwioM •boat wh^btr 
cwWif faii f ia i iMi t ii l iifci H iMiirt»tt» •M«iifa«»facll«i«witak«rkMt|«k. 
all* cMMMtf M fftAdhw Mmney At«U Cm i>4 St (ItTS) tad PMtma u4 No«kk 
Mc iMi' tf y W w4 la nfadtakTar amaplt. la QtrD paW tftt way it «iiiiikal dac^oa^lMa. 

Mie fanMhtltat al Mitettaa fa wyek itdtl vat- 
Mt art «a^ taplMI at -atOWai.* ly «aht ar 
(kt qaaaUM, lafaiha iaportaM* 
" For 



4kf«rMrfbtaitri.atHitaaM0>lMtafgat^ •Mflt.tftatcaariMllaodaOybapartutto 

Hamr.»kaaiktayritaali«faattiirfHi|. l»^UaciMe»ollc«ta«lWpQlk«faNt.aat 

■ tal ifa ia l it J fa l iii n ili m fapr»>riafait. aifk tl* at|«l al aaplorat tttacttoa 

fadtatpafcyaftimfataailibaeaaMtdtarlhat lalalha »i acfaal |* parftnaM Uadar tbaH 

At iMt al apanpriaMr Mfaff • vaitoUt M afatWfailMMi.llk|naribkfaeaapiifa(Wtf 

fa we l l ! t H a h yta M l | )fa irftrtha >ffat^ fa. Itcttainlt Ikl tfct tpteal itltdfaa oat- 

««»»tt «afat >i % ia m i. aal tochafaat wkdttr ctaM t u tilh ^ fa • tptdlad art a( vafai^ Dif. 



IUilral««ariaUtat«fft«etarta4annvriafa «f«tal tfteal taltcltaa nil« TWt dtcfaoa* 
i t ri t i p t te ti (fa ilfa u w. wl nl lta f tl lri ti ) mw n tktmtfa aaM aaW Mpftdt Mt a( iW vahw 



badttrtyii i| t I faaai naraMa. Dtritafita. a«l Colt. 




aoitt tfcal II b paa*fa aaJtr iW m ii rt w ay p< pofaaUtiy awawfrf i|,|c„u that ty aoi 

prtackfa pnMrfatiJfaitltctfaaalaaMlarfM. afaoarf fa « eaapanUt coawa f« tfaat po(«a. 

rofttaatalaat|m»ihai«MCkar«««atfcau|h tfaty tai me ta f rf. UaJwtfcbfaaaJtr ftaawA. 

tJfapoMailal«KX«itafatfaiJfat«apvapt(Ifa both iIm ntnullka aad tit Cofa propoMb eoald 

applkaaiika4WaaaMad)«i|fafaiaM*ite. I» aaaa at lo|tetflr cwfatol dtdiloa^haorttic 

lltr.Ht lkta NnirtiiJfapMlUt pokey afi*. tthtfaat wdtr partiealaf typct al nlati 

lactiat(faia«lMal|i«apifapra|iafifa>topail fa • dfaa|Wal aofa cafaafaaUng 4acUoa. 

McetBratatfalWtMapt.'nMniAka'ifafertp- tWttte ippitiifc la wltctfaa him, CtmUck 
(l»7Dar.niiifaltliataa.,affaatfa%aaoiUB. 

pellcy.bMildlffanri««ilfaitlcctknn4tdtrt*ail iolv«d and -»« aot fa lanM tiy mthtnutka] 

frDmtWpndIdh•vaUilytUlMlpoialakM•,Ot^ tptdalfai-(pLS|).b(«byeafaMlyftt«ininitpd 

Ifaftoa (IVTl) aefat varlova poirifaly reawMblt dtUttof Iht USttrMt vdut pentloM csproMd. 

(lM<lincfMii)ao(JoMalbUiandiuttcttaapro- HMt. Uik lUlliiical apptMch Wtpt lo <tiitin|uah 







ni it i n irtiiirf»m>i>» Lfc i H i>ia>^ art wy n nUi Hif i l f wwwihuna^ 

wiwlwrrf— tlMti r lli H iM,»d«>.Cn. r"*"?'!?**'^!^'!"^^^ 

■M>JfaHln(MM>>w»yfw Ji^J r Mi n iiiil ^ Jtf f h iNa cW i ifhu, 

»H»wrf»>,i,Mt>« ni l » "^^^^'^^^^^ ktf«y«Ht 

«»i"<«»tWiiit IfiiBiliiiiiml-ipMt MM «iliir«riw Mara 

•IvvfkML -Jfc— If mil J\i_ 

^MiMct m lw H —iitlf AlHir<rxril|W«>i Onw (Milt 9) «mlM ife pw. 

CrfbMi^lM4;ClMnrAWlMii.imPMMl M(ri^l»«i»,MeilK«i«p.McJ^ 

«i«Mfki«llWMnlJ«^«rMMMlMaM V*^ ^ tUi Wlw • imUm bm 
M.a«il>tiflM««kkfcrMkiliraMimiW y<!tl»tM«^«>dlM»<Nol««)pN^^ 

A2irfr«i»««ar4^i.WH^ i»«Mri-h.^ Trt — am. Not t) go. 

MralH«(KiiHl]r«i iMMiMniidkMfaf. pMaMkiMtMi|bBw»«fme«M 
iiitt««rife««ii«Miii»far«.p«, ifl^g^r^^ 



-— -* »--■■---- «» — _ ■ « ** ■ - ■ — ■ * ■ j.i _ j ^ 




if " V ? *"*" - >^ >^ *^ 
ivs» aaplM Weta* mauH*. 

•auusratMniinATsncALnvDio 



tnMiuajuocMOfnwmrGQMmucTiQii ii IHlii liwiiiM mi IdJS 



lllilt il l l , lll l w,^ 



NnHiaM«IMdifMil*CffM4ftll«»l.iL. L.I 




n* wilk M Mm kM trhM Ml il • 

M ICckl MBMra itVtaUt llMtNMt il 

^•ckl imyilii tyiadMy.nkcom hi W« 



n-jfcfcMgfc ,«JWvt ««Jh, A,^ 
•vf'Mt il aiw y iij kU ^ il M mkmS 







mfmm»l\mM,l^ikmhm I Hiiiitf 

WWl lr i i ll H ^twI w Hiili «hMl«hMllMtffl««MkiMNMfck.tlH 



»mmm9ihfmmmk,m»mmtMmi^ l»<lfcn4iwli«<tol»iM! iiittw 

li» li Mfct dar whu Min. tf Mr. «fy«l aMly yvps !■ At MkM «yyiiy 

MiM fct Mfcm. |« Am «M MMi^ fw. MMlBtlH^».«tMy«iWMi«lMy««M 

ilMiMiiMl]r«dLri«bUytlHMiM9»> Mw ■!» ito tfct wfrfirfit»tiM 




) «M art f«M IB il» am if !■ tlw Mb < 
hctal Mm. Cmmw dH ■ffnmHt if 

rfftctMy niM^ «rf hm mmM to d«r f*. ^ „ _ 

fwMlhMhM Mp MlirftlwMttmirfMnr. IhM m air «if My li ykU Mdtt witk » 

H«mr.iMklidblkiMhis«ppMMll)rMMt And « irlipilMtJ yticy JtcMw. 

dfktMaqrpMp'iMmw.aNromliVtlH ftoiiy, mJ -<twiy f iiiw rtU iw hanwd 

iir<i ti> n i l wliin J tW rmt rnt M mM k- ilitt wMtbr ar Ml Mb «• blMi. «Mk mI* 

Mi> MiyaMilpMtiftlHeMfbiiMyyofeeytaiM 

■oA W iiJi n Mi wfcnMrf wtHMly m« ticfcntl» lm Mi f ^tl»<Mm.MJ<fc»dMMfy 

lih-i im M tfc df Kr toi l Jl mm lMiihm al Win Tt pNMW ite tUt bvite ta« m 

topfciUMiftrii iyf i i i j iwH iwrftlwtryt iMMlilyiMMrfMlUtalilikateM.niM 

raiMl«W««. HMWw.ta^tWMMaritiMM ftky < i Amt «^ tWt 

•MMrfrWliW«kV«PlWkbMMMlWyM MMhiW^»lMtUr«MlMk«tb«M. 

M<t«JtWt>Mr— fcirtwi wntHjMJWcuM 

»katw«lMM4MMi|f*«MtaMi»«at»ilH ttnuNCiNmi 

wdilfiicy H l tl iwl*t wi HuMJ tlwwiltfc. t>-ii<LA.OlM littt^hTli^— ifc. 

ftM«awiinite*Mt«ill*M*%teM> •MvnfM^A««^niH«iiMtatMMMT 

Mi**M.dMr.M«ta9aeiMi|»hM " i' VrVy t m " 

w l ilii m iiirfiwypiicy.AgyhwMtMc» 4 Cw^aa^a a^^r. i,i i,|H.i<^Mi 

iWMtfyMrfM tiiiitAaiirfcalcdlwl !Lr^^f?!*^'^^"^«*>«-*>it^ 

Oitl»Mt1aliy>.MW|hinHMmHi^hm Slii, " ' 

>aMh>iM<.MJlfcmlnfc»piMiMrf«wtli 4 « " ^ , . ^ ^ ■ 

M»iiMiii»aMifa|MM4arilMlMif^ ^ * 

■"-'i—'riiiiiiiiniiinntiiL imSSif£i 



U l . n l. ■ fc n tfawl H i i if, „ f h lfct«11 &DZ?rrM«MM.I»M.MM.lirM 




hi|>fc«MiM>wl>MMlMlyii|iriinlrtM,fcf ^ 

IW MlinllM Ml MUag «f MMy lilJlvtf«ill & kMCItCMiVMN.rilMMMlMNa.Mte 

«lMdMrtWyai«M«Wnif«nMtymM« MMMrftahtaliltoMiMiMMMM 

not !• hd. M MM kMy lU MtciZfteM^ I*'r'^ *' i*^Hy.H.itiry. . M , ll id»4 

tlAUteS ikM BMi «f IMMtk »il COCM i* Uw " " ' - °^ 



_ _ m^tm^t III W| I 111 M WimMl ' ^ 







The SAT 
in a Diverse Society 

Fairness and Sensitivity 



The SAT undergoes
meticulous checks to guard against
ethnic or cultural bias.

by Thomas F. Donlon 





Reprinted from
THE COLLEGE BOARD
Review







WHEN THE COLLEGE BOARD introduced the Scholastic Aptitude Test (SAT)
in 1926, one of the objectives was to provide colleges with
assistance in coping with a growing diversity among their applicants.
The Commission on Psychological Tests, a group of eminent
psychologists, had been given responsibility for evaluating the
suggestion that there be an SAT, and it began its report in 1926 by
citing the changes in enrollment: "Statistics concerning higher
education very plainly show the numerical increase of college
population . . . The natural consequence is that many institutions
have sought to develop more adequate means for selecting from among
the applicants those best fitted to profit by the opportunities
offered." To provide "more adequate means," the committee recommended
that there be an SAT. At this point, of course, the College Board had
been in existence for over twenty-five years, and annually offered a
number of achievement tests. But the Commission saw the limitations
of the Achievement Tests, with their heavy dependence on curriculum,
for these widening applications. "In some cases," they wrote,
"limitations of educational opportunity would seem to be a factor in
causing low marks in Board examinations. . . . This would be
expected, since the Board examinations measure specific
preparation. . . . a candidate whose educational opportunities have
been limited has a much better chance . . . (on) a test which is not
a measure of specific preparation . . ."

All in all, this fundamental premise for the SAT, that it offers
". . . a better chance" in the face of "limitations of educational
opportunity," has been fulfilled. But as the world of college
education widened, the SAT, because it was not a measure of specific
preparation, because it reflected attainment in a very broad and
general way, came to provide a meaningful common yardstick for
facilitating the appraisals of an ever-expanding and increasingly
diverse population.

But this general success in measuring varied groups did not blind the
College Board to the possibility that there could be problems with
the interpretation of SAT scores due to population diversity.
Precautionary studies of the test performance of males and females,
for example, were conducted from the very beginning, reflecting the
strong concern on the part of the Board that there be no
inappropriate differences in performance. In general, such studies
demonstrated the appropriateness of the test for a wide variety of
groups. The SAT, by virtue of the breadth of its coverage, and the
careful editing of its content, is a balanced instrument with
relevance for a variety of candidates. Although it is now only two
hours long, it covers a range of topics and tasks that tends not to
favor any one subgroup.

The questions of its fairness increased in frequency, however, with
the emergence of a strong national concern for equity in access to
higher education in the 1960s and 1970s. Is the SAT, in fact,
unbiased? To what extent, for example, is it equally appropriate for
men and for women? In general, the answers to such questions have
been positive. As the college-going population has grown in numbers
and broadened in variation, the SAT has continued to be appropriate
for candidates with widely diverse backgrounds and curricular
emphases. In recent years, the efforts to keep it that way have been
intensified. However, while current evidence based on statistical
comparisons of subgroups confirms that the test is basically fair,
there is a need for a continuing program of review and analysis of
the test from this standpoint. In the 1960s and 1970s a number of
studies were undertaken, and since 1973 the types of questions used
on the SAT have been periodically appraised for their appropriateness
for different groups. This article provides a description of the
procedures which are used in the test development process to ensure
appropriateness, and the principal approaches which are used in the
statistical analysis to evaluate test fairness.

Fairness in Content
There are several ways in which problems of fairness may arise
regarding the test. Some of these are readily obvious, are reflected
in clear imperfections in the content, and are apparent to a reader;
others are hidden, detectable only in some characteristic of the
scores. Thus, inappropriate content in a test may be outright
offensive to certain groups, either through the portrayal of negative
stereotypes, or through the diminishing of a group's importance
through a failure to recognize it. Each of these flaws may create
problems for test takers who notice the content defect and whose test
performance is affected by it.

The more obvious problems of faulty content are readily avoided. Such
avoidance requires a certain vigilance in editing, and a knowledge of
subgroups and their reactions, but there is no special mystique to
the process. Since its earliest days, the SAT has been carefully
developed and reviewed so as to avoid material that may be offensive
to anyone. The contemporary concern is merely an extension of this
traditional process.

The effort to screen material is largely successful. Occasionally, a
reading comprehension passage may generate concern on the part of
someone who disagreed with it. Generally this has happened in the
context of the so-called argumentative passage. Each form of the SAT
from about 1950 on has had an argumentative passage, described in the
specifications as "the representation of a definite bias on some
subject," and often such passages present an impassioned argument for
a fairly extreme position. Questions based on such passages are
intended to test the candidate's ability to spot a specious argument
and to deal with strongly opinionated material. In spite of
disclaimers by the College Board or Educational Testing Service (ETS)
of any approval of the opinions expressed in the test material, there
may sometimes be a reaction from candidates, particularly from those
who are opposed to the particular viewpoint expressed. Arguments
against athletics, or democracy, or a graduated income tax do not
upset the vast majority of the candidates, but the range of diversity
among candidates is so great that some small number (believing
strongly in athletics, or in democracy, or in a graduated income tax)
may react with concern. The problem has not been unique to testing,
of course; it pervades all of education in a society such as ours, in
which a few people may feel very strongly about some things in a way
that the vast majority does not. Granting the minority their rightful
voice or influence is often a difficult matter. It is, however, an
important problem and one which must be dealt with.

In general, such difficulties have been relatively minor. The test is
conservatively edited, and it is not an instrument for social change.
Throughout the years from 1926 to the late 1960s, consistent with the
contemporary trends in educational text books and the media in
general, the basic standard for appropriate test content was simply
that it should reflect the mainstream of education and of life, the
majority experience. While no overtly offensive or objectionable
material was allowed to creep in, the content of the test, in
sampling from mainstream prose, avoided direct reference to
minorities or to minority-related problems.

Beginning in the 1960s, the prevailing treatment of minorities and
women was widely challenged in society. The predominance of white
male role models in the media and in the arts was viewed as
overstated, and as inculcating expectations of sex and racial
differences which worked to the detriment of women and of racial and
ethnic minorities. Widespread changes began to appear in newspapers,
in magazines, and in text books, as the language underwent a ripple
of reform and as "Ms." began appearing in correspondence everywhere.
Suddenly there was heightened awareness of the absence in the media
of a balanced treatment of the sexes and racial and ethnic
minorities. Reflecting these national patterns, the SAT began to
change. The policy against overtly offensive content, content which
could be upsetting to anyone, was, of course, retained. But a new,
affirmative policy for the test emerged. Not only must negatives be
avoided, in the sense of derogatory stereotypes, but there must be
what one black educator called "respect stimuli": positive
acknowledgment of the stature and accomplishments of various ethnic
and racial groups and the diverse histories they reflect. A failure
to deal openly with minorities would no longer do.

These forces were responded to, and the emerging patterns are
reflected today in revised content specifications that require, for
example, one minority-oriented passage in each form of the SAT, and
an appropriate variety of references to women and minorities
throughout the material. These changes appear most vividly in the
reading comprehension passages and in the sentence completion items,
which have more text than the analogy or antonym or other types of
questions. The Test of Standard Written English also consists of
material which can reflect cultural diversity, and it, too, is
carefully controlled in this manner.

These screenings of material from various viewpoints have come to be
called "sensitivity" reviews. The new practices are, of course, not
limited to College Board tests, but are applied to all tests that ETS
prepares.

Sensitivity reviewers volunteer for their assignment. They are
primarily test developers, since a knowledge of the subject matter
areas covered in the tests is generally useful. Further, many
reviewers are members of minority groups. Reviewing is not restricted
to minorities, however. The guidelines clearly state that "it should
be stressed that minority group membership is not a mandatory
prerequisite to performing sensitivity reviews and serious
consideration (is) given to all [illegible] . . . who volunteer."
What is needed is essentially a sensitivity and awareness that can be
developed through training, and that enables a reviewer to sense when
material may be offensive. What's necessary is the ability to review
tests from multiple perspectives, not simply from the viewpoint of a
single subgroup or social/political perspective.

As this discussion suggests, the inclusion of certain material is as
important, if not more so, than keeping other material out. For
example, a question from the Test of Standard Written English might
appear in either of two versions, as follows.

Version A  The newly enacted legislation
           requires that all counties with
           rural populations to provide
           transportation to the polls and
           absentee ballots. No error.
                  (D)          (E)

Version B  The newly enacted legislation
                      (A)
           requires that all counties with
                      (B)
           Spanish-speaking populations
           to provide bilingual registration
                      (C)
           and election materials.
                      (D)
           No error.
              (E)

It is obvious that the question still measures the same fundamental
point about grammar, regardless of the content reference within which
it is framed. The point is that specific content is often ancillary
to some other purpose in designing a question, and that the modern
goal of a test that is reflective of cultural diversity may often be
achieved by adapting the material.





In some Achievement Tests, however, questions dealing with such
matters as the migration of blacks from rural to urban communities,
or with women's enrollment in courses in science, may be directly
useful for measuring the outcome of instruction and study. Such
minority-related questions will be included, but only after a careful
review for appropriateness both from a cognitive dimension and an
affective one. At the same time, they must be judged to meet the
general standard that they are "both relevant and essential to
effective measurement."

Direct questions of this type, calling for a knowledge of minority
matters, are more often likely to occur in historical subjects,
literature or literary subjects, legal studies, and psychological
subjects. There can be no mechanical guidelines for determining the
decision of what is to be included, and in what form. The policy
guidelines for reviewers call for the exercise of "prudent judgment"
and a consideration of material relative to "the context of the
entire test." The policies are not policies of exclusion; they are
policies that seek to minimize the potential for negative reaction to
material and to require that all material be justified by some
function that makes it necessary to use it in a test.

Statistical Checks

The new ways of reviewing test content are not based on the
assumption that the score patterns will be different for any modified
items. That is, changing a sentence completion question from a
reference to Abraham Lincoln to one mentioning Susan B. Anthony or
Martin Luther King does not usually alter the success rate on the
question. The new policies are justified by values, rather than by
statistics.






Statistics, however, are important, and the SAT is carefully studied
from this viewpoint, also. The basic statistical facts that emerge
from comparisons of scores are fairly well known within the
educational community: among SAT candidates, males do substantially
better than females on mathematical material, while white majority
students of either sex do better on the SAT-verbal sections than
counterparts from such minority groups as Puerto Ricans, blacks,
Mexican Americans, Oriental Americans, or Native Americans (Indians).
On the SAT-mathematical section, Oriental Americans do best, followed
by whites and other minorities. These patterns pose challenging
problems to test sponsors who must show that they do not result from
some flaw in the test material and that the test score differences
reflect differences which will afford valid prediction.

The basic way to look at such score
differences is to compare the predictive
power of the SAT for the two sexes or for
the several minorities. Using this approach,
if the test is not biased, two candidates
with the same score should perform
about equally well in college,
regardless of their subgroup membership.
The number of studies of this type
constitutes a fairly voluminous literature. In
1978 the College Board sponsored a
summary by Breland* that considered
the usefulness of SAT and of High School
Record (HSR), among other measures,
in a variety of studies of the college
admissions process. In general, this survey
supports the appropriateness of the SAT
for many populations. It may sometimes
be advisable to develop a special prediction
equation for a given minority group,
since SAT equations based primarily on
white males may overpredict for blacks
and underpredict for females. But the
SAT offers predictive value for virtually
every group it encounters.
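
The idea of an equation that "overpredicts" or "underpredicts" for a subgroup can be made concrete with a small sketch. It assumes a simple straight-line prediction of college grades from a single test score, fitted by least squares on a reference group; the data, the group names, and both function names are invented for illustration and do not come from the studies cited.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for predicting y from x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def mean_prediction_error(slope, intercept, x, y):
    """Average of (actual - predicted) for a group.

    Negative: the equation overpredicts for this group.
    Positive: the equation underpredicts for this group.
    """
    return mean(yi - (intercept + slope * xi) for xi, yi in zip(x, y))

# Invented illustration data: test scores and first-year grade averages.
reference_scores = [400, 450, 500, 550, 600]
reference_grades = [2.0, 2.25, 2.5, 2.75, 3.0]
other_scores = [400, 500, 600]
other_grades = [2.2, 2.7, 3.2]

# Fit the equation on the reference group only, then apply it elsewhere.
slope, intercept = fit_line(reference_scores, reference_grades)
error = mean_prediction_error(slope, intercept, other_scores, other_grades)
print("mean (actual - predicted) for the other group:", round(error, 3))
```

In this made-up case the second group earns higher grades than the reference equation predicts at every score, so the mean error is positive and the equation underpredicts for that group.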

Even though such studies of prediction
tend to show no unfairness in total
score on the SAT, the fact that there are
subgroup differences in average score
level cannot be ignored in a test that is
widely used as one element in college
admissions decisions. There must be an
effort to explain this observed score
difference, in order to promote test fairness.
Accordingly, additional methods that
consider the performance of groups on
individual questions and clusters of
questions, rather than on the test as a whole,
are used. Are any questions inordinately
hard or easy for certain groups? Are

* Hunter M. Breland, Population Validity and
College Entrance Measures. New York: College
Entrance Examination Board, 1979.



[Figure 1. Item difficulty (delta) for a white male sample plotted against difficulty for a matched sample of black males. Each point is an item, and the line is the line of best fitting relationship. Questions above the line are relatively easier for black males; questions below the line are relatively easier for white males. Males approximately equal in verbal ability.]

[Figure 2. Item difficulty (delta) plotted against difficulty for a matched sample of white females. Each point is an item. Males and females approximately equal in ability.]






The Science of Test Fairness: 
A Closer Look 



One aspect of an analysis of differential difficulty is an
examination of individual questions that seem to be farthest
from the line of general tendency (or line of best fitting
relationship in Figs. 1-2). The fact that they show greater
than average distances tends to raise questions about
their functioning. As mentioned in the text, such individual
examinations have not generally been fruitful so far.

An example of a question which was relatively harder
for blacks was:

RUNNER:MARATHON::

(A) envoy:embassy
(B) martyr:massacre
(C) oarsman:regatta*
(D) referee:tournament
(E) horse:stable

* Correct answer.

Although "regattas" are less frequently associated
with the minority experience, they are far from common
for most of the majority, as well. The statistical analysis
looks as follows:



Group          Omit   A      B      C*     D      E

White sample   3(%)   7(%)   13(%)  53(%)  18(%)  6(%)
Black sample   ?      7      12     22     31     21
* Correct answer.

When the percentage of candidates electing each of the
wrong answers is compared, it appears on the surface that
the question worked somewhat differently for the general
groups because answer choices (D) and (E) together
proved twice as attractive for the black sample as for the
white. Is it possible that experiences with or the meanings
of "referee:tournament," "horse:stable" are different for
the two groups?

Before accepting such an hypothesis, however, it must
be noted that the black sample and white sample in the
previous comparison are not matched in ability. To a certain
extent, then, the sets of percentages describe
noncomparable groups. When the black general sample is
contrasted with a more equivalent sample of whites, the
following results appear:

Group                     Omit   A      B      C*     D      E

Matched white subsample   6(%)   7(%)   9(%)   26(%)  3?(%)  16(%)
Black sample              ?      7      12     22     31     21
* Correct answer.

It is clear that matched whites perform very similarly
to blacks. The clear implication of this is that it is
not a black subculture which influences these patterns
(for whites are not raised in that subculture), but some
sort of general lack of knowledge or understanding of this
item. Further confirmation of this is provided by comparing
a subgroup of black candidates who are approximately
equivalent in score level to the general white population.

Group                     Omit   A      B      C*     D      E

White sample              3(%)   7(%)   13(%)  53(%)  18(%)  6(%)
Matched black subsample   ?      10     15     47     ?      5
* Correct answer.

These blacks respond to the question more like whites
of equal ability, in general, than like the black or white
groups that are lower scoring. In short, there is no clearly
apparent racial/ethnic content in this item, even though it
appears on the surface to be relatively more difficult for
blacks when random samples are used. When the responses
are examined in depth, the patterns of success
and error for racially defined groups of comparable ability
are not different. What seems at first to be a
clear case of cultural influence is seen on closer examination
to be simply a result of differences associated with
score levels. The question looks different for groups at
different score levels, but it does not look different for
racial groups of comparable ability.

These questions are typical. Aaolkg(4ii9tlloa«w^ ^
mjporiiissdsnwstiatadrilrtvily/aiwatfaMito i

It is aadMdaUe that Cscvaaiss, Aoaib Ms mtk Iwt >
tha^«hlchitdsseribad.wasacMderhktfiia: .

(A)

<o

(D)



* Correct answer.

Again, it is useful to adjust the results by defining
more comparable groups. The following shows the
performances of four groups: the general samples and two
specifically selected subsamples, blacks matched to typical
white candidates, and whites matched to typical black candidates.



OmU A 



MatrhttlW tffc sub- ^ "y^'''» * ^ 

Osas i a l wMi s sample g It 19 » ' ' 12 ^^^2 ^' 

OwwalUacfcaamHa 10 12 17 if |4 - J5 
Matched wMs sub- ^ ^ ^ 

sample I I? l> t 14X151^ 7 -5 

* Correct answer. ""Tl 

There is not a great deal of difference between the groups
on this question, even when the difficulty is not controlled.

When the adjustment is made, the similarity is strong.
Distracter B is the most popular choice for all samples;
the remaining choices are very evenly distributed.






there patterns of question content that
might explain differences in difficulty for
different subgroups? Total test score still
figures in the analysis, but the methods
presume that the test is, on the whole,
unbiased.

The approach may be called "differential
difficulty analysis," for it tests
whether those questions that are difficult
or easy for one group are the same ones
which are difficult or easy for another. If
they are not, if many questions shift position
from hard to easy as they are administered
to one group or the other,
there is evidence that the test works
differently for the two groups.

Probably the quickest way to describe
the method is to resort to a diagram, as in
Figure 1. The axis on the left side reflects
the difficulty of the questions
(items) for a white sample, whereas the
line on the bottom reflects their difficulty
for a black sample. The data are from an
analysis of the form of the SAT-verbal
section which was given in April 1975.

The numbers that describe the difficulty
scale for the questions are called
deltas, which can range from 5.0 to 21.0.
They are based on the percentage of a
group answering the question correctly.
A delta of 13 is yielded when 50 percent
of the group select the correct answer.
Very difficult questions (10-20 percent
pass) yield deltas of 18-21, very easy
ones (80-90 percent pass) deltas of 5-8.
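
The text does not spell out how a percentage is converted to a delta. The sketch below assumes the conventional normal-deviate form, delta = 13 - 4z, where z is the standard normal deviate corresponding to the proportion answering correctly; that formula and the function name `delta` are assumptions, not taken from the article. Under this assumption the 50 percent anchor of 13 is reproduced exactly, while the ranges quoted above for very hard and very easy questions are matched only roughly.

```python
from statistics import NormalDist

def delta(proportion_correct):
    """Assumed delta transformation: 13 - 4 * z, where z is the standard
    normal deviate whose lower-tail area equals the proportion correct.
    Harder questions (fewer correct answers) get larger deltas."""
    if not 0.0 < proportion_correct < 1.0:
        raise ValueError("proportion_correct must be strictly between 0 and 1")
    return 13.0 - 4.0 * NormalDist().inv_cdf(proportion_correct)

# Show the scale at a hard, a middling, and an easy question.
for p in (0.10, 0.50, 0.90):
    print(f"{p:.0%} correct -> delta {delta(p):.1f}")
```

Putting difficulty on a normal-deviate scale rather than a raw percentage scale is what allows two groups' item difficulties to be plotted against each other along an approximately straight line.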

The samples of candidates used in this
analysis were a random sample of black
males and a sample of white males
approximately matched on a verbal test.
In general, results are clearer if the two
groups under comparison are approximately
equal in ability, as they are in
this example. The points in Figure 1
indicate that questions vary in difficulty
for the two groups in similar ways. Questions
that are easy for one group tend to
be easier for the other; questions that are
harder for one group tend to be harder
for the other. Most questions cluster
closely about the line of best fitting
relationship, which is not a statistical
regression line but one that minimizes
distances in both directions.
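
A line that "minimizes distances in both directions" can be read as the major (principal) axis of the scatter of points, which minimizes perpendicular distances rather than vertical ones. The sketch below computes such a line under that reading; the article does not give its formula, so the function `major_axis_line` and the delta values are illustrative assumptions.

```python
from math import sqrt
from statistics import mean

def major_axis_line(x, y):
    """Slope and intercept of the major-axis line through the points,
    i.e. the line minimizing the sum of squared perpendicular distances."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxy == 0:
        raise ValueError("no linear trend: the major-axis slope is undefined")
    slope = (syy - sxx + sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx

# Invented deltas for five items in two groups (horizontal, vertical).
group_x_deltas = [9.0, 11.0, 13.0, 15.0, 17.0]
group_y_deltas = [8.2, 9.9, 12.1, 13.8, 16.0]
slope, intercept = major_axis_line(group_x_deltas, group_y_deltas)
print(f"line of best fit: y = {slope:.3f} * x + {intercept:.3f}")
```

Unlike a regression line, this line treats the two groups symmetrically: swapping the axes gives the same line, with the slope inverted.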

Another plot is shown in Figure 2.
This shows the performance of white
males and females on the 60 questions in
the April 1975 form of the mathematical
portion of the SAT. Differences in performance
between the sexes on mathematical
material have long been observed.
Again, however, there is evidence
of consistency of difficulty between
the two groups.

Figures 1 and 2 are fairly typical. In
general, if a question is relatively harder
for one group, it is relatively harder for
the other. The items all cluster around
the line of best fit, in a relatively narrow
band.

The first step in any evaluation is to
consider the basic overall information
concerning the consistency of difficulty.
How different is the white and black
experience of the questions, as indicated by
their relative success? Do the questions
rank in the same order of difficulty for
both groups? The typical answer to this
question is a correlation coefficient, the
statistical index that shows level of
relationship from 0 (no relationship at all) to
+1 (a consistent relationship). In educational
and psychological literature, correlations
of .95 to .99 are considered
very high. But such correlations are, in
fact, what the typical fairness analysis
for the SAT shows. In six studies performed
by ETS since 1973, the correlations
of item difficulties between whites
and blacks of both sexes were between
.94 and .98.
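
The correlation described here is computed across questions, with one pair of deltas per question. A minimal sketch follows, assuming the ordinary product-moment (Pearson) coefficient is meant; the delta values are invented. The coefficient can in principle also be negative, although that does not arise with item difficulties of this kind.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Invented deltas: one entry per question, same question order in both lists.
deltas_group_1 = [8.1, 10.4, 12.0, 13.5, 15.2, 17.8]
deltas_group_2 = [9.0, 11.1, 12.6, 14.9, 15.8, 18.9]
print("correlation of item difficulties:",
      round(pearson_r(deltas_group_1, deltas_group_2), 3))
```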

It is important to note that what is
being correlated in such studies are the
two sets of question difficulties or deltas,
one defined by white performance, one
by black. The high correlations mean
that the rank order of difficulty level for
questions tends to be the same in the two
groups: a question that is relatively
harder for blacks is relatively harder for
whites (relative, that is, to the other
items). These very high values indicate
that the test works in fundamentally
consistent ways for both whites and blacks,
and these numbers constitute the major
finding of the research to date. They suggest
that, whatever the factors are that
affect score differences, they are not
simply a case of poorly chosen question
types or some source of inappropriate
content. Questions are relatively hard or
easy for the two groups in consistent
ways.

But the questions do show some variation
in their distance from the line of best
fit. Some are virtually right on it, some
above, some below. The second step in
the differential difficulty method is to
measure the distance between the point
associated with a question and the line of
best fit. Those questions which are
farthest away are called "outliers," and
they are selected for further study. They
are the ones that show differences that
would not be expected from the overall
score differences for the groups. Thus,
they are in a sense inconsistent with the
total score data. The question naturally
arises: As a group, do these questions
have any characteristic in common that
could explain why they are the farthest
ones away?
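
The second step can be sketched as follows, taking the line of best fit as already known. The signed perpendicular distance is used, so that points above the line are positive and points below are negative. The item labels, the coordinates, and the choice to report a fixed number of the most distant items are illustrative assumptions; the article does not state a cutoff.

```python
from math import sqrt

def signed_distance(x, y, slope, intercept):
    """Perpendicular distance from the point (x, y) to the line
    y = slope * x + intercept. Positive above the line, negative below."""
    return (y - (slope * x + intercept)) / sqrt(1.0 + slope ** 2)

def farthest_items(points, slope, intercept, how_many=3):
    """Labels of the items lying farthest from the line, most distant first.
    `points` maps an item label to its (x, y) pair of deltas."""
    ranked = sorted(
        points,
        key=lambda label: abs(signed_distance(*points[label], slope, intercept)),
        reverse=True,
    )
    return ranked[:how_many]

# Invented items: (delta in the horizontal-axis group, delta in the vertical-axis group).
items = {"q1": (10.0, 10.0), "q2": (12.0, 15.0), "q3": (14.0, 13.5)}
for label in farthest_items(items, slope=1.0, intercept=0.0, how_many=2):
    print(label, round(signed_distance(*items[label], 1.0, 0.0), 2))
```

Items returned by such a ranking are only candidates for review; as the next paragraph of the article notes, their position is often due to sampling error.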

Generally speaking, the answer to
date, based on several studies, is "no."
The position of a given question on such
a chart is often due to sampling error,
and the particular questions that emerge
from a given study are often not the same
ones that would emerge if the study were
repeated with data from a different sample
of respondents. An inspection of
questions with large distance measures
most often shows them to lack any rational
characteristic which would explain
their being "outliers." Some examples of






Table 1. Average Distance from Item to Line for
Four SAT-Verbal Question Types



such questions accompany this discussion
(see sidebar, page 5). These are
typical, and they are no more plausibly
related in content to the stereotyped cultural
differences than the rest of the questions
in the test.

Not only may these distances be used
to identify individual questions, they
may also be averaged to compare the
properties of different types of materials.
In the SAT-verbal section, for example,
the data in Table 1 emerge from analysis
of the average distance of the four SAT-verbal
question types. Table 1 summarizes
the results of four SAT forms.

A minus sign means that the items
were, on the average, relatively harder
for blacks, a plus sign that they were
relatively easier for blacks. Because of
the method, the values within a test will
balance, so that if two item types show
average differences in one direction, the
other two will show differences in the
other direction. The averages in Table 1
tend to indicate that analogies and sentence
completions were on the average
somewhat more difficult for blacks (compared
to whites) than were antonyms or
reading comprehension.

These results do not mean that analogies
and sentence completion are "biased
against minorities," or that antonyms
and reading comprehension are
"biased against whites." These are really
very slight average differences, and no
appreciable change in score patterns
would emerge if the test were reconstituted
entirely of antonyms or reading
comprehension items. The results are
possibly due to sampling differences, or
to differences in the average difficulty
level of the question types. In general, an
item type cannot be considered "biased"
on statistical grounds alone; there must
be some knowledge of why the results
are obtained.

The comparison of the item types in
this manner is a demonstration of the
general approach to the use of average
distance measures. Using this approach,
it is possible to study reading passages of
different content, or mathematical questions
which have diagrams associated
with them, and so forth. The method enables
the analyst to compare the average
distance of a variety of interesting
categories of questions.

Question type             N

Analogies                 80
Sentence completion       60
Reading comprehension    100
Antonyms                 100

One of the principal limits in an ordinary
differential difficulty analysis, however,
comes from the fact that many
groups differ in total score level. This
tends to make outliers out of questions
that are more or less sensitive to differences
in total score. One way around this
difficulty is to divide the candidates into
subgroups of approximately equal ability.
In several studies, for example, the
familiar College Board 200-800 score
range has been broken up into 100-unit
bands: 200-290, 300-390, 400-490, and
so forth. The groups of candidates in
each range are compared with respect to
success on the questions. Using this approach
for a question, it is possible to
show an overall difference between the
groups but to fail to show any significant
difference at any of the smaller ranges
considered. When this happens, it is evidence
that the overall difference in performance
on such a question is a result
of differences in average score levels between
the groups. About 50 percent of
all questions showing overall group differences
do, in fact, fail to show significance
in such "range comparison"
tests when they are subjected to them.
The remaining questions tend to meet
both criteria: their overall differences are
sufficiently different to make them unusually
distant, and there is evidence of
statistically significant difference within
at least one score range. For such questions
the evidence indicates that some
factor other than total score levels is
influencing the result. Statistics, however,
cannot tell us what that factor is. They
simply provide a signal that the question
must be carefully reviewed if it is to be
used at all.
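
The "range comparison" idea can be illustrated with a small tabulation: candidates are grouped into 100-point score bands, and the groups' success rates on one question are compared within each band. The records below are invented, and the sketch only tabulates proportions; an actual analysis would add a test of statistical significance within each band, which is not attempted here. The group labels "X" and "Y" and all function names are hypothetical.

```python
from collections import defaultdict

def band(score):
    """Lower bound of the 100-point band containing a 200-800 scaled score."""
    return (score // 100) * 100

def pass_rates_by_band(records):
    """records: iterable of (group, total_score, answered_correctly).
    Returns {band: {group: proportion answering the question correctly}}."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [correct, total]
    for group, score, correct in records:
        cell = counts[band(score)][group]
        cell[0] += int(correct)
        cell[1] += 1
    return {b: {g: c / n for g, (c, n) in groups.items()}
            for b, groups in counts.items()}

def overall_rate(records, group):
    """Proportion of one group answering correctly, ignoring score bands."""
    hits = [correct for g, _, correct in records if g == group]
    return sum(hits) / len(hits)

def make(group, score, n_correct, n_total):
    """Build n_total invented records, the first n_correct of them correct."""
    return [(group, score, i < n_correct) for i in range(n_total)]

# Group X is concentrated in the higher band, group Y in the lower band.
records = (make("X", 350, 1, 2) + make("X", 540, 6, 8)
           + make("Y", 310, 4, 8) + make("Y", 590, 3, 4))

print("overall:", overall_rate(records, "X"), overall_rate(records, "Y"))
print("by band:", pass_rates_by_band(records))
```

In this invented case the two groups differ overall but have identical success rates within each band, which is the pattern the article attributes to differences in average score level rather than to the question itself.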

.16     1.29 to +1.02
+.08   -0.92 to +1.17
+.11   -1.0? to +0.94

A useful adjunct to differential difficulty
analysis, used only informally up
to this point, is called "distracter" analysis.
The gist of the approach is provided
in the discussion of actual questions that
have shown group differences (see sidebar).
The wrong answers to multiple
choice questions are called "distracters,"
and, as the discussion shows, ethnic, racial,
or sex-defined groups of different
composition can be meaningfully contrasted
with respect to their patterns of
response to these various alternatives.
The method works best for samples of
equal ability, because differences of ability
can introduce "artificial" differences.
(See the example of RUNNER:MARATHON.)
But a distracter analysis, if it shows
similarity of patterns of responses, offers
confirmation that the internal solution
processes for the groups are reasonably
equivalent. That is, if the patterns are
similar, there is not one "minority" mental
process and another "majority" process.
Blacks who score well do so in
ways similar to those of whites who
score well; whites who do not score well
do so in ways that are like those of
blacks who do not score well. As the
sample items in the sidebar show, it is
possible for blacks and whites, when
properly matched, to show very similar
processes. To date, the use of these techniques
for College Board tests has been
limited to informal inquiries by test developers
who seek to understand individual
items. A more systematic use of the
method, however, may emerge for the
future.
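
A distracter comparison can be sketched with the RUNNER:MARATHON figures from the sidebar's first table, as best they can be read from this copy (the white-sample entry for choice D is taken as 18, and the illegible black-sample omit rate is left out). The function name and the idea of expressing each wrong answer as a share of all wrong answers are illustrative assumptions, not the article's stated procedure.

```python
def wrong_answer_shares(percentages, correct):
    """Share of each wrong option among all wrong-option choices.
    Omits are excluded, so the shares sum to 1."""
    wrong = {opt: pct for opt, pct in percentages.items()
             if opt not in (correct, "Omit")}
    total = sum(wrong.values())
    return {opt: pct / total for opt, pct in wrong.items()}

# Percentages choosing each option, unmatched general samples.
white = {"Omit": 3, "A": 7, "B": 13, "C": 53, "D": 18, "E": 6}
black = {"A": 7, "B": 12, "C": 22, "D": 31, "E": 21}

white_shares = wrong_answer_shares(white, correct="C")
black_shares = wrong_answer_shares(black, correct="C")

# How much more often choices D and E together were selected.
d_and_e_ratio = (black["D"] + black["E"]) / (white["D"] + white["E"])

for opt in "ABDE":
    print(opt, round(white_shares[opt], 2), round(black_shares[opt], 2))
print("D and E together, black relative to white:", round(d_and_e_ratio, 2))
```

Because these two samples are not matched on ability, the comparison shows only the surface pattern; the sidebar's point is that the contrast largely disappears once matched subsamples are used.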

The statistical analyses for test fairness
also consider other aspects of the
tests besides the comparison of the difficulty
of questions. For example, differences
in the characteristic work rate of
various groups are often suggested as a
source of score differences. In this view,
minority students may run out of time to
finish, leaving large numbers of questions
unattempted. Accordingly, in doing
an analysis, a careful check is made of
the proportions of whites and blacks who
complete the sections of the test. For a
set of samples matched on ability, the
average percentage of blacks completing
an SAT-verbal section was about 9 percent
less than the average percentage of
whites. The average percentage of blacks
completing an SAT-mathematical section
was about 4 percent less than the average
percentage of whites. These are not
very large differences. Nor were there
great differences in the number of blank
questions at the end of a test. On the average,
for SAT-verbal sections whites left
1.2 of 40-45 items blank, blacks 2.8
items. For SAT-mathematical, whites left
1.1 of 25-35 items blank, blacks 1.7. The
differences between the racial groups,
then, are not large by either yardstick,
"percent completing the test" or "total
items left blank." They suggest that
while a somewhat greater number of
blacks may move somewhat more slowly
through the test, the differences are modest
and not likely to produce sizable
score effects.






The Continuing Effort

Both the review processes and the statistical
approaches are continually under
development. "Sensitivity" issues demand
a current understanding of the
viewpoints of the candidates and a constant
effort to perceive the test from the
perspective of individuals in the significant
groups. On the statistical side, there
is work underway on more powerful
methods that apply mathematical models
to the evaluation of differential item
performance. These more powerful statistics
are more useful than current methods
when groups differ in ability, and it
seems likely that these differences between
groups are virtually unavoidable
in day-to-day operations.

The continuing effort to produce a fair
test has been rewarded so far with excellent
reports from reviewers and analysts,
but the programs must continue to develop
newer and better tools as they go
along. Questions of fairness are a very
important aspect of a test, and they must
be asked repeatedly, for as the candidate
group changes, the answers to questions
of fairness may change. In the meantime,
the evidence developed so far supports
the conclusion that the SAT is not likely
to be unfair to any group.



About the Author

Thomas F. Donlon is program research
scientist at Educational Testing Service
and works mainly in the area of College
Board programs.




Mr. Edwards. Thank you very much, Miss Rigol.
We now will hear from Dr. Carol Dwyer, who is executive direc- 
tor for test development, at ETS in Princeton. 

STATEMENT OF CAROL ANNE DWYER

Dr. Dwyer. Thank you, Mr. Chairman. Good morning. 

I have worked at ETS for 15 years as a test developer. I am 
trained as a psychologist and consider myself a measurement spe- 
cialist. I have had a lifelong interest in gender and achievement. I 
am very happy today to have this opportunity to talk to you about
these issues. 

I would like to start by saying some things about standardized 
tests. Historically, one of the main objectives of the development of 
standardized testing was to set comparable or standard tasks for 
everyone to demonstrate their knowledge or skills. Part of the aim 
of this was to give comparability. We have heard this morning al- 
ready that a grade of "A" from one teacher doesn't necessarily 
have the same meaning as a grade of "A" from another. 

But another part of this development was specifically to improve 
fairness. What standardized admissions tests replaced were criteria 
that are unacceptable by today's standards. For example, family 
connections, the possibility of large financial donations to an insti- 
tution, one's religion, race or sex. Then, as now, I believe that the 
alternatives to standardized testing are very poor, indeed, and are 
more so for disadvantaged groups. 

I would like to talk a little today also about women and their 
test scores, with a focus, as most of my predecessors have done, on 
college admission tests. 

I think we would all agree that women and their roles in educa- 
tion are changing, and the tests, I believe, can give us some very 
important information about this. But what do tests tell us now 
about these changes? Dr. Rigol has already mentioned the changes 
in the group of people who choose to take the SAT over time. 
Women are now the majority of that group, approximately 40,000 
more this year than men. But we also know that these women who 
choose to take the SAT are a less privileged group, academically 
and otherwise, than the men. 

My written testimony has gone over some of the facts of women's
scores relative to men, and women's scores now as opposed to in
previous years. Dr. Rigol has also alluded to this. But the decline
particularly in women's scores on the verbal test has been so much
discussed recently that I would really like to say some things
specifically about it.

I feel there are two principal reasons why we have a decline in 
women's verbal scores relative to men's. The first— and I believe it 
accounts for the large proportion of that change— is the population 
change that Dr. Rigol just mentioned. Women who before would 
not have aspired to college are now taking the SAT, and that's 
good news for those women and good news for women in general.
But it is not good news in terms of the score average. 

But, based on much broader evidence than the SAT, this score
decline is just one part of a consistent trend in a much larger
picture. A very important piece of work has just been completed by
two psychologists and measurement experts, Dr. Janet Hyde and
Dr. Marcia Linn, of the University of Wisconsin and the University
of California at Berkeley, respectively. They have completed a
quantitative review of 165 studies of verbal ability, looking at
the patterns of sex differences within these studies. They reached
the conclusion that there is no overall sex difference in verbal
ability today, and they believe that the overall sex difference
that previously existed disappeared around the year 1974. They
were able to differentiate the studies before 1974 from those
afterward. There was a small advantage in favor of women before
that, which is mostly what sticks in people's minds, and

SJS an'^erof"?SL^^ 

This study, which is a very significant one in this field,
is perfectly in accord with information that we have from
admissions testing programs, which are typically volunteer samples,
such as the American College Testing Program and the SAT, but also
information on the population in general that we have from very
good sources such as the National Assessment of Educational
Progress.

I would also add, almost parenthetically, that around this same
time period, the early seventies, in a number of achievement areas
that we think of as being traditionally female, such as foreign
language learning, women also lost their advantage to men. I think
that is a significant piece of educational information. Frankly, we
don't know why this occurred, or why the period around 1974
was critical. The answers to these questions have to go beyond the
tests themselves because, for one thing, in some of these studies
exactly the same, unchanged, tests were used before and after this
magic period, so that it really

?h"^SlL'thrhi\Tanred" °' 

women^SSS^S'a? ^ ^ 

Dr. Dwyer. Yes, they are definitely a different population of
women. But I also think that women changed their behavior. At
the same time they stopped taking the less traditional stuff, they
began taking more math, more science. They began taking more of
the "nontraditional" areas. I think there is perhaps a connection,
but I'm speculating here, on the basis of the data that has come
to light recently.

PviHpn5nf^K^"°? before us today I think, is are these changes 
evidence of bias. In a word. I think the answer to that is no In 
trying to cpnvmce you of this. I would like to talk about bias in 
teste as well as bias m test use-and others Ufore me today have 
™5?-^* distinction. But I would really like to reinforce it be- 
cause it is absolutely critical. 

Test scores differ because people differ. The average
weights of women and men, for example, as a group, differ in this
country. That doesn't necessarily mean that the bathroom scales
they stand on in the morning are biased, although some of us
certainly wish that they were. But few would argue that we have
complete educational equality today for women or for minorities.




Therefore, good tests should reveal these differences, where they
exist. 

It is very tempting to blame the tests for telling us things we 
don't want to hear. But yielding to that temptation is only going to 
lead us to gloss over real educational and social problems. Of 
course, tests themselves can be biased or unfair. But as test devel-
opers, we see sex bias as a very important component of our cen-
tral concern, which is making tests valid. And by valid, all I mean 
is that a test is accomplishing its intended purpose. 

Now, when we speak of bias in test use, the use of tests as dis- 
tinct from the tests themselves can be biased, and I would like to 
give you just one example of this. If you were trying to select 
people for jobs assembling electronic components, and you used a 
spelling test to select people for these jobs, that would be a biased 
use of that test. On the other hand, the very same test showing the 
very same kind of group differences used to select secretaries might 
well be a very valid use of that same test.

These questions about how tests are used, as Dr. Cole stated ear- 
lier this morning, are not primarily technical or statistical ques- 
tions. They are a matter of social values and logic and priorities. 
There are many ways to use tests to predict criteria such as college
grades. But the consensus of measurement specialists is that a 
number of prediction systems may be technically sound in any 
given situation, but the right one to choose depends upon your 
goals and priorities, not upon the technicalities of that system 
itself. 

Now, an organization like ETS cannot and should not be making 
these values decisions for institutions and other test users. But we 
do have a responsibility to make tests like the SAT as technically 
sound as possible, to provide technical assistance to the institutions 
and users of these tests, to make recommendations to them about 
how to use them, and to set standards for appropriate test use. We 
do these things. 

Charges that the SAT cheats women by underpredicting their 
grades just demonstrate a fundamental misunderstanding of the 
role of the tests themselves in a prediction and selection system. 
The SAT, when properly used, is a valid predictor of college grades, 
for both men and women. And we must remember, above all, that 
the most difficult questions and decisions that have to be made 
about issues like college selection require values decisions that 
must be made, whether or not tests are used at all. 

Now, the question here is what does ETS do about this when it 
makes tests. It's a very important question, and it is important in 
my everyday life as a test developer. Our whole worklife and the 
complicated system of producing a standardized test is aimed at im- 
proving validity and at eliminating bias. But there are particular 
aspects of this that I think are very directly related to the issue of 
bias that I would like to tell you about briefly today. 

There is a process called a Test Sensitivity Review, wherein
every question, before it ever appears before a student, is reviewed
by specially trained reviewers, using documented criteria to
eliminate offensiveness, inappropriate language, and stereotypes. For
example, we would not have a reading test that portrayed women
only in domestic roles when it mentioned them. I have brought
the guidelines of ETS' sensitivity review process, which
?tu/tnt •''•'^ gooiinSit "f ! 

.A?* ^J"^ •pproioh thii euMtlon riatlitically by antlyiing rtiulti 



*ulV i„ .7 iui"*'*TV.*'?v f^pwna auiflnmtiy to tno quHtlon in a 
way that ii unfUr-that la to lay, an m tvant diffloultythat may 

•UtStlJtf liSit^il^ SL'^l'? that thli wmbSlftK^^ 
Katiifioal information, plui th« judgmmtal information, ia a vary 

^^biff"^''^' rauflh mow powarftil than olSor affi 

in coj^unction with oommit- 

S:bT;teateA^^^ of lnitltS»l 

^. I Had Tntondad to toflyou that alio tho DTS itaff who work on 

SS!-%lrtf'i "MX**' Howivtr, I can't help 

?A . ^S^"*".!!' ^ «t2^ ^Moh quottions 90 In. BoUovo me. 
ih5Jl*f?uSu®? *° ^> and everybody pTtohee in. So who 

i!»?i"i;„iiira^^^ «f 

Mr. Edwards. Salaries are appropriately equal, too?

w*^' We have deve oped, with internal 
iSMdSSl. ' ''^•'^""y liapplled to eveiy 

Mr. Edwards. Everybody knows what everybody gets?

t.£lt«?^Sf3iSjy '^f^J' P«'«»nnei»trMd we do ayi- 
tematlo analyiei. When imbalanoea occur in the coune of a year. 

^ nade. That la ^viewed MnuX. I &n^^^ 

wi? attkSi**" i®"*^JEW?*» development in toat 
I !u°°'!i'^'?^"l'o"* ^ our ataff I be leve are univenally reootr- 
"♦S??j»V'he Aeld of meaaurement in thla ana. ^ ^ 

wJSSA*!!!*' Reaearch Aaaociatlon la meeting In 

mJSW^« ?«!![!fi"i**^"i"* of aeaaionif rf 

toll tlf^aS :SPuf;hllf^l*i. W^'" ^"''^ «v»*fr-t5 
I would like to finish by saying something about the alternatives
to testing. We do sometimes hear alternatives to standardized testing
suggested, but more often this issue is ignored, as if our difficult
decisions would vanish if tests ceased to exist. But I can't
stress enough that it's important to remember that whether tests
are used or not, we would still have these difficult decisions to
make about people.

Testing takes place in a complex social setting and has recog- 
nized limitations, but I firmly believe that no better alternatives 
exist, including the option of not testing at all, which would allow 
race and sex bias to reenter the decision process, unexamined and 
unchecked. 

The alternatives that are sometimes suggested, such as using 
grade point average alone, or letters of recommendation, or person- 
al interviews and ratings, all have reliability problems, validity 
problems, and especially fairness problems that are worse than 
those of carefully developed and carefully used standardized tests. 
Just as important is that these alternatives do not lend themselves 
to public scrutiny in the way that tests do. The focus on personal 
qualities also has historically worked to the detriment of disadvan- 
taged groups. Traditional out-group members, among which I per- 
sonally would include women and racial and ethnic minorities, 
have benefited from situations that are highly structured. This is 
what I think of sometimes when I think about the salary-equity 
model at ETS, which I feel is a very equitable organization. When 
the criteria are clear, when people know what the rules of the 
game are, that is when the disadvantaged groups can get ahead. 

In closing, I would repeat the advice of some of the other presenters
today, that we try to focus on the causes of the differences
we're seeing, rather than narrowly on the indicators of the differences,
if we're really going to improve education and contributions
to society of women and minorities.

Thank you very much. 

[The statement of Carol Anne Dwyer, with attachments, follows:]



Btjcwtiw oiqioty tei^ 



•t AlMiriiigfln 
Fairness in Standardized Tests

n, xnr 






Good morning, Mr. Chairman and members of the subcommittee. My name is
Carol Anne Dwyer. For the past fifteen years, I have worked at the Educational
Testing Service as a developer of tests. ETS is a measurement and research
organization headquartered in New Jersey. We are most widely known for our
standardized admissions tests, including the Scholastic Aptitude Test (SAT),
which we develop for the College Board, the Graduate Record Examination (GRE),
the Graduate Management Admissions Test (GMAT) and the Test of English as a
Foreign Language (TOEFL), which we also conduct for sponsoring boards. I am
presently in charge of test development for elementary school, secondary
school, and higher education testing programs.

I am a psychologist, a Fellow of the American Psychological Association,
and a member of its Educational Psychology Division's Executive Board. I have
also served on the Executive Council of the American Educational Research
Association and have been Vice President for Measurement and Research
Methodology with that organization.

My primary professional research interests, beginning with my doctoral
dissertation at the University of California, Berkeley, have been the fairness
of tests, the relationship between gender and achievement, and the interface of
technology and social values. I have conducted training activities for AERA,
APA, and other associations and institutions on bias in testing, and have
chaired and served on numerous women's committees for AERA and APA. I was one
of the founders of AERA's Special Interest Group on Research on Women in
Education.

Understanding bias, and knowing how to avoid it, is at the heart of what
we do at ETS. Fairness is integral to the term "standardized." In every
aspect of our work, from the development of questions, to the administration of
tests, to the scoring of answer sheets, to the reporting of scores, and to the
use of our tests in society, we are involved in the constant pursuit of equity.
The contributions of ETS to the test bias literature over many decades show
clearly that ETS is a leader in research and development in this field.

This morning, I would like to talk about four major issues concerning the
fairness of tests. First, a word or two about why we have standardized tests;
next, the question of "bias" on tests. Then I would like to share with you
some of the recent trends in standardized test scores for females and
minorities (which are often mistakenly assumed to be evidence of bias).
Finally, I'll discuss admissions tests and what we do to ensure their fairness.



Why Standardized Tests

Now, about standardized tests. One of the primary purposes of
developing standardized educational tests, which have a history in this country
back to the past century, was to ensure the fair treatment of every test-taker.
"Standardizing" means that each student is exposed to the same or equivalent
tasks, administered under the same conditions, in the same amount of time, with
scoring as objective as possible. These methods overcome problems that would
otherwise exist in comparing students from different grades, schools, or areas.
Without standardized tests, their performance could only be evaluated by
different teachers using different methods, according to different criteria for
success, and this would create questions about equivalency. For example, a "B"
from one teacher in one classroom may indicate more knowledge than an "A" from
a different teacher in a different classroom. Or the top class rank in one
school may represent the same level of achievement as an average rank in another.







^J:^^;^ Previous Dtthod. of s«l«ction wk. acntin. tasad^cnluA 
SdZ,r.JSi"*^^?2L~-J' "^"^ '^''l' P«»=tlng equity in adnLsi^ 

both fSS^fjS '^'^ "^^l proven useful to 

MtaiM latwMn th«a. Studmts Unefit by tii«lr ability to select a miiaoa 

X^^^ SL^il^f^Sr' «*«ittlnj rtukrrtB vOog. test pi?OTZca and 
^ifLJS^'^JIS^ ti»t thv (W llfcily to be able to haSelhTwrk 
required and thus continue beyond the first year.

fo,. i!Si!i2?"J'?^*^**i*" «tai«icn, txMvnr, are mt the only roaacn 

r»jdr»fota — am wm ri^^ — without vtanSatd mmucm, m would^laSr^ 
•MwtiaX data for d^tttmlning »;Mh«r tli^Sc^ hSSiSiii? 

1» a great deal belrer said these 
<3»y. about bias In terta, aid next I'd like to say a few uorisatoutaat?^ 

HffJ^ SJSliL*'^ that a tert i. biased if diffWwt gtoui» of people qet 
^ JS^.rr^i?^^. '**^' <ll«««»e. In artof thiie^ df 

- -ss^^r^ity^isr^ ^ 






taken, attitudes toward these subjects, kinds of non-school experiences, and
school grades received. We expect these differences; we are enriched by the
diversity that many of them bring to our culture. We are alerted by other
differences to important problems to be solved. Tests are not intended to
eliminate or disguise these differences; they are intended to identify them, if
they exist, as accurately as possible, whether the results are judged to be
positive or negative.

It is important to distinguish between test results that show differences,
and the factors that cause the differences. Scales, for example, do not cause
people to gain or lose weight. Tampering with the instruments to cover up
differences is tempting, but dangerous and wrong. Tests are an easy target
when they reveal unwanted or unexpected results, but they are the wrong target.
Changing tests simply to hide differences in achievement could lead us to
ignore real problems that should be addressed.

There are, of course, ways in which tests can be biased or unfair.
Avoiding bias is central to a test-maker's main concern, that of developing a
valid test. By "validity" I simply mean the extent to which the test
accomplishes its intended purpose.

A test itself, for example, could conceivably contain questions that are
unfair to a group of test-takers because of offensive language or inappropriate
presentation of group members. It could also contain content that is not
accurately representative of the ability being tested or questions that are
poorly worded or unnecessarily confusing. It is extremely important that tests
be free of such bias, and I will tell you later in my presentation what we at
ETS do to ensure that our tests are fair in all respects.

It is also possible that a particular use of a test, rather than the test
itself, may be biased. Use of a spelling test to select people for jobs that
require no spelling (such as assembling electronic parts) is a biased use.
That same test used to select secretaries may be perfectly appropriate, even
if the average scores of the secretaries and the electronics assemblers are the
same. Potential bias can also occur when test scores are used to predict
performance on an inappropriate criterion measure (i.e., an outcome we would
like to predict, such as class leadership or future income). This can occur if
the criterion measure itself is invalid or biased for certain groups, for
example. Tests can also simply be used for the wrong reason.

How tests are most equitably used in society is not primarily a technical
or statistical question. Test makers have a responsibility to supply technical
assistance, make recommendations, and set standards of good practice for the
services they supply; but fair test use is a question of values that goes
beyond the test itself and its makers. The purpose of testing and the best
strategy for dealing with any group differences should be defined before any
use is made of tests. If a stated policy goal is to increase the number of
minority nurses on a hospital staff, for example, a racially balanced group of
trainees might be selected from a pool of qualified applicants all of whom
passed a nursing exam, rather than being selected simply in rank order of their
test scores. Or if a college admissions staff's primary need is to predict as
precisely as possible (without over- or under-prediction) the first-year
grades of a group of applicants, they could use estimation procedures
that will maximize that precision. Validity studies provide valuable






Trends in Score Differences

ocn»J£S ITL.'l^l^ on the SAT verbal section have also declined in 

intho difference cn the NAEP st^ients' 2 Lo^loX''^^ 

increases m ren's sabres, rather than a decOine in oJ^^ ^ 

Studies now in progress show that one major cause of the decline in
women's average scores on admissions tests relative to men's is demographic
change in the self-selected group of people who take the tests. The most
important change is that many more women are now taking the SAT than ever
before. Whereas women constituted only 44.5% of the test-takers in 1965, now
at 52%, they have become the majority. This no doubt means that more women are
aspiring to higher education. However, there is evidence that these women on
the average are not as well prepared academically as the women who previously
took the test. Therefore, their mean scores should not be expected to be as
high as those of their predecessors. The net effect of this is that when the
"new" group of women is included in the score average for all women, the
average goes down. There has been no corresponding trend for young male high
school graduates.
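
The arithmetic behind this kind of composition effect can be sketched in a few lines of Python. This is only an illustration: the group sizes and mean scores are invented and are not figures from the testimony, and `combined_mean` is a hypothetical helper name.

```python
def combined_mean(groups):
    """Return the overall mean for a list of (count, mean_score) groups."""
    total_count = sum(count for count, _ in groups)
    total_points = sum(count * mean for count, mean in groups)
    return total_points / total_count


# Hypothetical numbers, chosen only to illustrate the effect.
earlier_takers = (1000, 450.0)  # the kind of test-taker who tested in earlier years
new_takers = (300, 400.0)       # newly added group with a lower average score

before = combined_mean([earlier_takers])
after = combined_mean([earlier_takers, new_takers])

# The earlier group's own mean is unchanged, yet the pooled average falls.
print(before, after)
```

Under these assumed numbers no individual scores any lower than before; the pooled average moves only because the mix of people taking the test has changed.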

We are also investigating the possibility that changes in test content
could have contributed to the decline in women's verbal scores. The amount of
science reading in the SAT changed during the 1970's, for example; however our
initial research does not indicate that the dates of these changes coincide
with the dates of the observed score changes.

The ACT Assessment program is the other large college admission testing
program that, like the SAT, tests over a million students each year. Users of
the ACT and SAT tend to be clustered in certain regions, with those using the
ACT concentrated primarily in the midwest and the southern region. The ACT
Assessment tests college skills somewhat differently than the SAT, but the
general trends in males' and females' scores are highly similar in both testing
programs. ACT also has experienced a growth in the proportion of women taking
the test and has also seen evidence that the women taking that test have had on
the average fewer courses in math and science than the male ACT test-takers.

Many of the issues that I have discussed today with an emphasis on women
are issues for racial and ethnic minority group members as well. We should
also remember that these are not separate categories: very substantial numbers
of test-takers are minority women.

Very often minority group members score lower on tests than the majority
group. It is generally observed, for example, that Black test-takers,
regardless of sex, score well below White test-takers on many educational
tests. The magnitude of the difference between Black and White candidates'
scores is larger than all but a very few gender differences. Hispanic
test-takers, as a group, tend to achieve scores somewhere between those of
Blacks and Whites. Asian-American test-takers, as a group, excel in
mathematics and science tests, but do less well than majority group members on
verbal tests. Again, none of these differences in themselves indicates bias in
the test, but may simply reveal continuing disparities in the education of
minority students of all ages. For example, we know that Black and Hispanic
students are less likely than White students to be enrolled in an academic
program in high school.

These broad generalizations hold true on major admission tests such as
the SAT and the ACT Assessment. However, there is some encouraging news. A
number of statistics from admissions tests, large-scale longitudinal surveys,
and the National Assessment of Educational Progress suggest that the gap
between majority and minority group scores is narrowing, particularly in
reading. Different tests show differences in the rate of this progress but the
overall trend is clear.

Mathematics represents a special problem area for both women and Black
test-takers as a group. Black students, like women, tend to take less
coursework in mathematics than majority males and to be underrepresented in
high-level math courses. This is, not surprisingly, correlated with their
mathematics test scores, and is an important area where further affirmative
efforts to increase women's and minority group members' participation in
mathematics and science activities are greatly needed in order to improve their
academic and employment options.

Ensuring Fairness in Tests

Differences in performance on standardized tests by
different groups have long been observed and are closely monitored by
educational researchers and testing companies. A necessary first step in
investigating score differences is to examine the test itself for any possible
bias. I want to take a little time now to talk about how we at ETS try to
ensure that tests are fair.

Today we are focusing on standardized admissions tests. These tests
are familiar to many of us because we or our children have taken them for
entrance to college, graduate or professional school. These tests have been
developed by specialized testing organizations which adhere to professional
standards. The most recent and comprehensive testing
standards were jointly developed by the American Psychological Association, the
American Educational Research Association, and the National Council on
Measurement in Education. ETS is committed to continuing to meet these and all
other applicable standards.

In addition, ETS, under the leadership of our president, Gregory Anrig,
has attempted to go beyond the standards of the profession as a whole
by adopting our own standards for the quality and fairness of the tests we
develop. These standards, which are set forth in this booklet, meet or exceed
the general professional standards. Chairman Edwards, I request that a copy of
these standards be inserted into the record of this hearing. In a further
effort to address the dual goals of fairness and quality, ETS has established
an accountability system of audits of all our testing programs. We have also
invited numerous panels of distinguished educators and other specialists to
critique our practices and to comment on them publicly.

We believe that our admissions tests are fair, as fair as anyone knows
how to make them, and that they are fairer than alternatives such as interviews
and letters of reference. Among the many steps taken to ensure the accuracy
and quality of the tests we develop, two are especially important in ensuring
racial and sex fairness: the "Sensitivity Review" and the "differential item
functioning" process, which I would like to describe briefly.

First, every question in every test developed by ETS must undergo
scrutiny by specially trained sensitivity reviewers who follow rigorous,
documented criteria designed to identify questions that may be called biased
because of inappropriate or offensive language or content. The reviewers also
check to make sure that the test is appropriately balanced with respect to
representation of people in different groups and in different roles. For
example, we would consider it unacceptable to have a test of reading
comprehension that showed women only in domestic roles. I would like to have a
copy of an overview of our Sensitivity Review Guidelines included in the
hearing record, Mr. Chairman.

Further, ETS has developed and is in the process of introducing
operationally new statistical measures of potential bias, or "differential item
functioning." The basic idea behind these statistics is that people who know
approximately the same amount about the subject being tested by a question
should have similar chances of answering it correctly, regardless of
differences in their race, sex or ethnic background. The statistics therefore
first match two groups of people in terms of their relevant knowledge and
skill, then compare their performance on each test question. This gives us a
measure of a test question's "differential difficulty." These statistics will
thus help to identify differences in performance that may reflect potentially
inappropriate characteristics of certain test questions. Such statistics will
be used by all the major programs for which ETS develops tests. The
combination of statistical analysis with thorough and detailed professional
reviews of all questions provides a much stronger guarantee against potential
bias than would either method used alone.
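
The match-then-compare idea described above can be sketched in Python. The statement does not say which statistic ETS computes, so this is only a sketch under stated assumptions: it matches test-takers on total score and averages the within-level gap in proportion correct, weighted by the size of the focal group. The function name, the group labels "focal" and "reference", and all of the data are hypothetical.

```python
from collections import defaultdict


def differential_difficulty(records, item):
    """Weighted mean of (focal - reference) proportion correct on `item`,
    computed within groups of test-takers matched on total score."""
    # tallies[score][group] holds [number correct, number of test-takers]
    tallies = defaultdict(lambda: {"focal": [0, 0], "reference": [0, 0]})
    for group, total_score, answers in records:
        cell = tallies[total_score][group]
        cell[0] += answers[item]
        cell[1] += 1

    weighted_sum = 0.0
    weight = 0
    for cell in tallies.values():
        focal_right, focal_n = cell["focal"]
        ref_right, ref_n = cell["reference"]
        if focal_n == 0 or ref_n == 0:
            continue  # no matched comparison is possible at this score level
        gap = focal_right / focal_n - ref_right / ref_n
        weighted_sum += focal_n * gap  # weight by focal-group size
        weight += focal_n
    return weighted_sum / weight if weight else 0.0


# Invented records: (group, total test score used for matching, answers 0/1).
# The total score stands for performance on the whole test, not just q1 and q2.
records = [
    ("focal", 1, {"q1": 1, "q2": 0}), ("focal", 1, {"q1": 0, "q2": 0}),
    ("reference", 1, {"q1": 1, "q2": 1}), ("reference", 1, {"q1": 0, "q2": 0}),
    ("focal", 2, {"q1": 1, "q2": 1}), ("focal", 2, {"q1": 1, "q2": 0}),
    ("reference", 2, {"q1": 1, "q2": 1}), ("reference", 2, {"q1": 1, "q2": 1}),
]

for item in ("q1", "q2"):
    print(item, differential_difficulty(records, item))
```

In this invented data, matched test-takers do equally well on q1, while on q2 the focal group answers correctly less often than equally able reference-group members. A question behaving like q2 is the kind that would be sent back for judgmental review.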

I should also mention that one of ETS's basic components in the test
development process to ensure test validity is the use of committees of
educators to plan and develop tests. These committees are composed of subject
matter experts, usually teachers or university professors. The committees
include women and men and minority and majority group members from all parts of
the country, all types of educational institutions, and all specialties within
their disciplines. They bring a broad perspective to the material included in
our tests and help ensure its accuracy. These committees work with an ETS test
development staff made up of 86 women and 46 men.

ETS has a long history of contributing to research on test fairness and
making the data we collect available to other researchers. Three current
studies, funded by the College Board, are particularly relevant to today's
topic. The first is a complete content history of all the SAT tests
administered from 1960 to 1987, telling us exactly what was tested on the SAT
and when. We can then examine over the years whether content variations did or
did not coincide with group score changes. [As mentioned earlier, the changes
in test content in the 1970's do not appear to have coincided with the dates of
observed score changes.] Another study will use the "differential item
difficulty" technique that I just described to examine SAT verbal questions to
see whether content factors (such as science contexts) are responsible for
score differences for men and women who are otherwise comparable in their
overall verbal reasoning skills. A third study will expand our knowledge of
the demographic characteristics of the women and men who take the SAT and the
relationship of these characteristics to their SAT scores.

Fairness is also important in how tests are used. It is the job of
testing companies to produce the best tests possible from a technical point of
view, and to provide interpretive material and sound technical assistance to
their clients and users as they decide how to use test scores. Admissions test
results, obviously, are intended to enhance the equity and efficiency of the
college selection process. Decisions about the use of test scores by colleges
do not occur in a value-free context and are not under the direct control of
ETS or any other agency.

Institutional and societal priorities are brought to bear on statistical
data. A better geographical mix of students, for example, may be desired in
the new first-year class at a small college in a Great Plains state. A larger
number of ethnic minority students might be sought by an institution in the
Pacific northwest; or a large, predominantly female first-year class may be
sought by a formerly all-male private college in New England which has recently
become coeducational. Each of these colleges will and should make its own
value judgments, according to its own priorities, as to how to use test scores
equitably in the admission process. This was the view taken by the National
Academy of Sciences' Committee on Ability Testing in 1977, which put it better
than I can:



"Even recognizing the inherent difficulties, we 
believe that admissions officers have to exercise
judgment, case by case, as, in fact, many now do.
The goal should be to effect a delicate balance
among the principles of selecting applicants who
are likely to succeed in the program, of recognizing
excellence and of increasing the presence of identifiable
underrepresented subpopulations." (P. 196)



Mr. Chairman, in closing I would like to summarize the major points I
have made today:

o carefully developed standardized tests are more fair than
the available alternatives, which frequently rely on
subjective personal judgments about groups and individuals;

o without tests we would lack basic information about how well
educational programs are working, information that is essential
if we are to focus our resources on educational improvements at the
state and national level that will be most beneficial;

o score differences exist, but by themselves do not mean
bias on tests; many factors contribute to such differences;

o ETS, a leader in research on testing and test bias, uses
processes for developing standardized tests that are thorough,
careful and designed to make our tests as fair as possible.

I thank you for the opportunity to speak to you today about an issue that
is near and dear to my heart. I will be glad to answer any questions you and
the committee may have.






CAROL ANNE DWYER

EXPERIENCE

1983-present   EDUCATIONAL TESTING SERVICE: Executive Director,
               School and Higher Education Programs (SHEP) Test
               Development. Administrative head of combined Test
               Development area, including the following departments:
               Science, Languages, Legal Projects, Mathematics,
               Literature & Writing, Verbal, Reasoning & Measurement,
               Education, Social Studies, Developmental Mathematics
               and Reading.

1982-1983      EDUCATIONAL TESTING SERVICE: Deputy Director, School
               and Higher Education Programs (SHEP) Test Development;
               Director of Admissions Test Development.

               Deputy administrative head of combined SHEP test
               development area, including Achievement & Certification
               Test Development and Admissions Test Development.

               Administrative head of Admissions Test Development,
               which includes the following groups: Mathematics,
               Literature & Writing, Verbal Aptitude, and Reasoning
               & Measurement. The Achievement & Certification Test
               Development area includes the following groups:
               Educational Processes & Developmental Skills, Science,
               and Languages.

1976-1982      EDUCATIONAL TESTING SERVICE: Test Development Group
               Head and Program Director, Elementary and Secondary
               School Programs

               Administrative head of test development unit of Elementary
               and Secondary School Programs

               Director of developmental and operational testing
               programs (Basic Skills Assessment Program, Bermuda
               Secondary School Certificate Programme, Delaware
               Assessment Program)

               Developer of subject matter examinations in psychology
               for the Graduate Record Examination Board (GRE) and the
               College Board

               Member of ETS advisory boards and committees for: Program
               research policies, candidate misconduct, program
               policies, norming studies, test analyses, women's
               affairs, statistical analyses of tests, item analysis
               procedures, use of cut-scores, personnel classification
               and compensation, regional office planning, prior
               review of research, controversial issues in testing






1974-1979      Coadjutant Faculty: Child Psychology (undergraduate);
               Counseling and Testing (graduate). Trenton State College,
               Trenton, New Jersey

1972-1976      EDUCATIONAL TESTING SERVICE: Associate Examiner,
               Elementary and Secondary School Programs

               Coordinator and primary developer for major testing
               programs (Secondary Schools Admissions Tests, Nova
               Scotia Educational Assessment, Oregon Statewide
               Assessment, Harrisburg (PA) Early Childhood Language
               Assessment project and workshops)

               Instructor, Intensive Resident Courses (Programs of
               Continuing Education):

                 Assessment Concerns in Early Education
                 Assessment and Evaluation in Educational Planning
                 Evaluation of Performance-Based Teacher Education
                 Criterion-Referenced and Objectives-Referenced
                   Measurement
                 Reporting, Interpreting, and Using Test Results
                 Criterion-referenced assessment of basic skills

               Administrative director of Bermuda Secondary School
               Certificate Programme

               Consulting:

                 Cleveland Board of Education
                 National College
                 National Institute of Education
                 Maryland Department of Education (assessment for
                   accountability)
                 Delaware Department of Public Instruction
                 Bermuda Ministry of Education (high school graduation
                   requirements)
                 Homot Institute (needs assessment and Title IX)
                 Wellesley Center for Research on Women
                 National Catholic Educational Association (testing
                   outcomes of religious education)
                 Wisconsin Research and Development Center for
                   Cognitive Learning, University of Wisconsin, Madison

               School Psychologist, Murray School District, Dublin, CA

1970-1971      Clinical Psychology Practicum (intern), McAuley
               Neuro-psychiatric Institute, St. Mary's Hospital,
               San Francisco, CA

1969-1971      Research Assistant, ETS Berkeley Regional Office,
               Berkeley, CA

1969-1970      Research Assistant, Dr. N. M. Lambert, University of
               California, Berkeley



1971-1977 






1969-1970      Psychological and educational testing: Head Start
               evaluation, Berkeley Unified School District; Stanford
               University School Mathematics Study Group; Far West
               Laboratory for Educational Research & Development;
               California School for the Deaf, Berkeley

EDUCATION

University of Chicago                   1976   Industrial Relations Center,
                                               Summer Management Development
                                               Seminar

University of California, Berkeley      1972   Ph.D. Educational Psychology

University of California, Berkeley      1970   M.A. Educational Psychology

Barnard College, Columbia University    1968   A.B. Psychology



SELECTED PUBLICATIONS

Sex Equity from Early Education through Postsecondary. In Achieving
Sex Equity through Education, S. Klein (Ed.). Baltimore: Johns Hopkins
University Press, 1985.

AERA Guidelines for Eliminating Race and Sex Bias in Educational Research
and Evaluation. Educational Researcher, 1985, 14, 16-17.

Technology and Testing: Implications for Validity in the Computer
Age. Educational Measurement: Issues and Practice (in press). Original
version presented at Fifth International Symposium on Educational
Testing, University of Stirling.

Equating the Standards of Educational Examinations in Two Countries.
British Journal of Educational Psychology (with R.J.L. Murphy, in press).

Sex Bias and Reading Tests. Paper presented to the International Reading
Association Annual Meeting, May, 19fA. (with K. C«rrit«)

Chair and presenter, APA Division 15 Recent Scholars Awards. American
Psychological Association Annual Meeting, Anaheim (CA), 1983.

Equity in a Cold Climate. Educational Researcher, 1983, 12, R-17. With
S.K. Biklen, L.S. Koester, D. Pollard, J.P. Scheuneman, C. Shakeshaft.








Encouraging Girls and Women in Mathematics (book review). The Psychology of
Women Quarterly, 1983, 2, 385-387.

Achievement Testing. Invited review of achievement testing for Encyclopedia
of Educational Research (Fifth edition). New York: Macmillan and Free Press.

Review of J. Stockard, P. A. Schmuck, K. Kempner, P. Williams, 8. t.
l9ta'l^^ **• Sex Equity in Education. Contemporary Education.

AERA Invited Training Session: Bias and testing. AERA 1982 annual
meeting (with J. Scheuneman).

Organizer and Chair, invited State of the Art Series. American Psychological
Association annual meeting, August 1982 (with Anne Anastasi, Robert Ebel,
and Samuel Messick).

Assessment of Young Children. Invited workshop. International Council
of Psychologists, University of Southampton (England), 1982. With W. M.
McPeek.

The Role of Schools in Developing Sex Role Attitudes. Chapter 12, in
J. Downing, et al. (Eds.) Sex Role Attitudes and Cultural Change.
Dordrecht, Holland: D. Reidel, 1981.

Test development for adaptive testing. Proceedings of the 23rd Annual
Conference of the Military Testing Association, jail, ypi. n. A301»iai2.

Training and Employment Experiences of Educational Psychologists. Paper
presented to Northeast Educational Research Association, October 1980
(with Janice Scheuneman).

Equating the Standards of Educational Examinations in Two Countries.
Paper presented to the Fourth International Symposium on Educational
Testing, Antwerp, Belgium. June 1980 (with R. J. L. Murphy).

Criterion-Referenced Testing. Invited workshop. International Council of
Psychologists, University of Bergen (Norway), 1980. With W. M. McPeek.

Validation of Performance Standards. Paper presented to the Fourth
International Symposium on Educational Testing, Antwerp, Belgium. June
1980 (with C. L. Wild).

Sex bias in selection. In L. J. Th. van der Kamp, W. F. Langerak, and
D. N. M. de Gruijter (Eds.), Psychometrics for Educational Debates.
Chichester, England: John Wiley & Sons, Ltd. (with C. L. Wild)

The role of tests and their content in producing apparent sex-related
differences. In A. C. Petersen and M. A. Wittig (Eds.) The Development
of Sex-Related Differences in Cognitive Functioning. New York: Academic






CAROL ANNE DWYER (continued)



The role of schools in developing sex-role attitudes. Paper presented to
the World Congress on Mental Health, Salzburg (Austria), July 1979.

Minimum Competency Testing: Problems and Solutions for the Eighties.
Symposium on Minimum Competency Testing, Temple University, Philadelphia,
October 1979.

Setting defensible performance standards (workshop). Phi Delta Kappa
Leadership Conference, Cleveland, Ohio, March 1979.

Minimal Competency Testing and Measurement Technology. Interchange, 1978,
whole issue.

A cross-national survey of cultural expectations and sex role standards
in reading. Journal of Research in Reading, 1979, 2, 8-23. (With J. Downing)

A debate on the proposition: adequate measurement technology exists to
implement fair, equitable, and useful minimum competency testing programs.
In Center for Applied Performance Testing, Proceedings of the National
Conferences on Minimum Competency Testing. Portland, OR: CAPT, 1978.

Sex bias in selection procedures and selection instruments. Paper
presented at the Third International Symposium on Educational Testing,
Leyden (Netherlands), July 1977. (with C. L. Wild)

Test content in mathematics and science: The consideration of sex.
Paper presented at the American Educational Research Association,
April 1976.

Test content and sex differences in reading. The Reading Teacher, 1976,
29, 753-757.

Test content and the determination of sex differences. Paper presented at
American Educational Research Association, Washington, D.C., April 1975.

Sex differences in reading: A symposium. (Ed.) Washington, D.C.:
National Foundation for the Improvement of Education, 1975.

Comparative aspects of sex differences in reading. In D. Moyle (Ed.),
Reading: What of the future? London: Ward Lock, 1975.

Comparative aspects of sex differences in reading. Paper presented at
United Kingdom Reading Association, Ormskirk, England, July-August 1974.

The influence of children's sex-role standards on reading and arithmetic
achievement. Journal of Educational Psychology, 1974, 66, 811-816.

Sex differences in reading: An evaluation and a critique of current
theories. Review of Educational Research, 1973, 43, 455-467.









EDITORIAL CONSULTANT

American Educational Research Journal
Educational Researcher
Encyclopedia of Educational Research (fifth edition)
Journal of Educational Psychology
Mental Measurements Yearbook
Quarterly Review of Development
Review of Educational Research
Educational Psychologist
Journal of Reading Behavior
Journal of Research in Mathematical Education



PROFESSIONAL ASSOCIATIONS

American Educational Research Association
Member-at-large of the Executive Council, 1982-1985
Vice President of AERA for Division D: Measurement and Research
Methodology, 1978-80
Committee on Long-Range Planning, 1984-85
Chair, Standing Committee on the Status of Women, 1980-1982
Chair, Committee on Research Guidelines, 1980-1985
Reviewer, Divisions B, C, D, and H programs
Judge, Divisions D and H research awards
SIG Research on Women in Education: Program Chair 1974-75,
Assistant Chair 1975-76, Chair 1976-77
Consulting Editor, Encyclopedia of Educational Research, fifth edition
Women Educators, Research Award Competition Judge (1980-1981, 1981-1982)

American Psychological Association
Division 15 Continuing Education Committee, 1982-1984; Chair, 1984-1986
Division 35 Program Committee, 1982
Division 15 Program Committee, 1981-1982, 1982-1983
Division 15 Nominating Committee (chair), 1981
Division 15 Committee on Women & Minorities in Educational Psychology,
1979-1981
Research Guidelines Committee, 1975-1976

International Reading Association (National Committee Member 1975-1977,
Sexism and Reading)

National Council on Measurement in Education
Program reviewer

International Association for Applied Psychology

International Council of Psychologists


Honors

Fellow of the American Psychological Association







ETS SENSITIVITY 
REVIEW PROCESS 



An Overview 



Educational Testing Service • Princeton, New Jersey






Acknowledgment 

The original procedures for the ETS sensitivity review process were developed by Ronald V. Hunter
and Carole D. Slaughter.

Substantial contributions to the process have been made by other writers of earlier documents dealing
with the issue of sensitivity. Many of these pioneering efforts, such as the ETS Guidelines for Testing
Minorities and the ETS Guidelines for Sex Fairness in Tests and Testing Programs, provided much of the
creative thought and detail contained within this document.

Finally, many ETS staff members have taken the time to review drafts of this document. In so doing,
they have provided a wealth of helpful suggestions and productive insights on this complex issue.



Copyright © 1987 by Educational Testing Service.
All rights reserved.

Educational Testing Service, ETS, and the ETS logo are registered
trademarks of Educational Testing Service.
Educational Testing Service is an equal opportunity/
affirmative action employer.






Table of Contents 



Introduction
Background
Factors Guiding the Sensitivity Review Process
  Cultural Diversity
  Diversity of Background Among Test Takers
  Force of Language
  Changing Roles
The Sensitivity Review Process
  Reviewers
  Test Sensitivity Review Procedures
    (1) Preliminary review
    (2) Final review
    (3) Arbitration
  Sensitivity Review Procedures for Other Publications
Review Criteria
  (1) Stereotyping
  (2) Examinee perspective
  (3) Underlying assumptions
  (4) Controversial material
  (5) Contextual considerations
    Historical domain
    Literary domain
    Legal domain
    Health domain
  (6) Elitism, Ethnocentricity, and Related Problems
Additional Information






THE ETS 
SENSITIVITY REVIEW PROCESS: 
An Overview 



Introduction 



Educational Testing Service is committed to the development of tests and other publications that reflect
a thoughtful and humanistic consideration of all people and that acknowledge the multicultural nature of
our society. In the 1970s, ETS broadened the review of all tests to ensure that 1) they contained questions
recognizing the varied contributions that minority members have made to our society and 2) there was no
inappropriate or offensive material in the tests. In 1980, the corporation, building on the review procedures,
formally adopted the ETS Test Sensitivity Review Process. In 1986, this process was extended to all
publications, including audiovisual materials and art work. The purpose of the process is to ensure that the
guidelines, found in the ETS Standards for Quality and Fairness, are met.

One such test development guideline instructs test developers to prepare for each test, with appropriate
advice and review, specifications that cover several critical areas, including requirements for material
reflecting the cultural background and contributions of major population subgroups.

Another test development guideline requires the review of individual items, the test as a whole, and
descriptive materials to assure, among other things, that language, symbols, words, phrases, and content
that are generally regarded as sexist, racist, or otherwise potentially offensive, inappropriate, or negative
toward major subgroups are eliminated.

Finally, an accountability guideline demands the review of publications and other materials to eliminate
language or material generally regarded as sexist, racist, or otherwise offensive or inappropriate.

Although a substantial portion of the process consists of general criteria that can be applied to any
population group, experience has shown that a particularly vigilant effort must be made to evaluate our
publications from the perspectives of the following groups: Asian/Pacific Island Americans, Black
Americans, Hispanic Americans, individuals with disabilities, Native Americans, and women. The process,
therefore, specifically addresses areas of special concern to these population groups.

Background 



Sensitivity review, required by Educational Testing Service for all its tests and publications, attempts to
eliminate offensiveness from all ETS materials. Such offensiveness could obstruct the intent of a publication,
whether a general publication or a test. In the area of test development, for example, the impetus to avoid
offensive material comes from a desire to ensure that each test is indeed asking all test takers to perform the
same task under the same conditions, insofar as it is possible to do so.

The importance attached to sensitivity review does not imply a measurable relationship between
material considered offensive by some test takers and the scores of test takers. However, material that
candidates consider offensive may produce negative feelings that may affect their attitudes toward tests, and
hence, their test scores. Recognizing both the negative feelings that a test taker may have when dealing with
test material and the possible effect that offensive test material may have on the test taker's performance,
ETS has instituted a sensitivity review process for tests and other publications.

The sensitivity review guidelines specify six groups that are to be given special consideration in
sensitivity review: Asian/Pacific Island Americans, Black Americans, Hispanic Americans, individuals with
disabilities, Native Americans/American Indians, and women. The guidelines, however, are general; they can
be, and are, extended to cover materials that are potentially offensive to the elderly and to members of other
groups, including men, not specifically mentioned in the guidelines.

Factors Guiding the Sensitivity Review Process

The sensitivity review promotes a general awareness of and a response to

• the cultural diversity of the United States,

• the contributions of the various ethnic and minority groups and women to the history and culture
of the United States as well as the achievements of individuals within these groups,

• the diversity of background, cultural tradition, and viewpoints to be found in the test-taking
population,

• the force of language in setting or changing attitudes toward various groups and toward women,
and

• changing roles and attitudes in United States society.

Cultural Diversity

Since the 1960s, the United States has become much more aware of the diversity of its population. Both
the civil rights and feminist movements have helped increase the visibility of women and people from
minority groups. Further, this representation has moved away from stereotypes and has emphasized the
occupational diversity and cultural contributions made by all groups.

Consistent with these advances in society as a whole, the ETS sensitivity review guidelines specify that
all ETS publications must include material that reflects the diversity of the test-taking population. By
underscoring the contributions of all groups to United States history and culture and by highlighting the
individual achievements of women and minority groups in fields such as science, literature, and business,
ETS tests and publications attempt to maintain a balance that acknowledges the cultural diversity of the
test-taking populations. The sensitivity review process requires the demonstration of such a balance.

Diversity of Background Among Test Takers

Because test takers are different, a question may carry an emotional charge for one candidate or
group of candidates that it does not carry for others. For example, a reading passage on sex differences in
intellectual ability, a question on the problems of living in a ghetto, or data concerning the presence of
certain diseases in a given population may very well be upsetting to some test takers. The sensitivity review
helps to ensure that material dealing with disabilities, gender, or ethnicity is developed with care. Further,
test takers may go away from a standardized test not knowing that they have given an incorrect answer or
that they have misread a passage; therefore, offensive statements included as choices for the answer to a
question may well reinforce the very stereotypes or bias that the rest of the test avoids. Such choices must
be avoided wherever possible.

Force of Language

With changing attitudes toward various groups within the United States have come changes in
the words we use. "Negro," for example, is no longer generally acceptable as a racial group description;
"Black American" is now the preferred term. At one time, people with disabilities were universally referred
to as "handicapped." The term used most frequently now is "disabled." A term such as "settlers and their
wives" is no longer used because it places women in a category apart from settlers, who are generally
considered male in this construction, and because it downgrades women's contributions to settlement.
Similarly, the so-called "generic he," though at one time considered the correct pronoun to use when
referring to both sexes, is now seen as excluding women. These and other words and descriptions that
exclude groups or perpetuate stereotypes are avoided in ETS tests and publications.






Changing Roles 

Significant social changes have taken place in the United States in recent years. Family patterns have
changed; women have entered the paid labor force in greater numbers and in positions they have not
typically held; members of minority groups are making important contributions to fields from which they
were largely excluded just a short time ago. ETS tests and publications reflect such changes, indicating to
test takers that ETS is aware of social change and of the opportunities open to all test takers. In ETS
materials, therefore, job titles that seem to restrict occupations (firemen, businessmen, stuntmen) are not
used. Further, women and members of minority groups are portrayed as active participants in society and
appear in a balanced variety of roles. Where a question in a mathematics test might once have mentioned
Mary Smith's calculations for roasting a turkey, a similar question today might mention her calculations for
establishing missile trajectories.



The Sensitivity Review Process 



Reviewers 

Reviews of ETS publications are conducted by ETS professional staff members who are trained in
sensitivity issues at two-day workshops and periodic one-day refresher courses. While there are a number of
reviewers who are women and/or members of minority groups, membership in such groups is not a
prerequisite, and any professional interested in the process and showing concern for equity may be trained
to administer it.



Test Sensitivity Review Procedures 

The test sensitivity review process has three components: an optional preliminary review (required by
some testing programs), a mandatory final review, and an arbitration process.

(1) Preliminary review

Any staff member who is assembling a test may request a preliminary review to screen questions and
answers, reading passages, and other materials for sensitivity-related issues. The reviewer's recommendations
are not binding at this stage; however, a preliminary review is an excellent means of identifying potential
problems early in the test development process, when modifications can be made more easily.

(2) Final review

The mandatory final review takes place after the test has been assembled and during the regular editorial
process. This review must be conducted, even if the test received a preliminary review.

The sensitivity reviewer, who is always someone other than the person who is responsible for the test (the
test assembler), notifies the test assembler in writing of any sensitivity-related issues the test has raised. The
test assembler must then address in writing all concerns of the sensitivity reviewer. In the vast majority of
cases, the test assembler and the reviewer are able to resolve the issues satisfactorily. When the two cannot
resolve issues raised by the reviewer, a sensitivity review coordinator meets with them to ensure that they
clearly understand each other's position. If the reviewer and assembler still cannot reconcile their differences,
they and the coordinator meet with a test development director, and the four of them discuss the problem
question or passage. Most issues are resolved at this point. In a few cases, the material in question must go to
arbitration.






(3) Arbitration 

Arbitration is performed by a panel of three staff members who are outside the test development areas
and who are not involved with the test in which the disputed question or passage appears.

After examining the disputed material, the panel must reach consensus as to whether or not the material
conforms to ETS sensitivity review guidelines and procedures. The decision of the arbitration panel is binding.

Sensitivity Review Procedures for Other Publications

Sensitivity reviews of ETS publications other than tests are performed by the editors of those
publications unless the editor is also the author, in which case another editor performs the sensitivity review.
Editors, like test reviewers, are trained in the sensitivity process.

As a rule, editors undertake sensitivity reviews when the manuscript has reached final draft stage, before
it is put into production. However, editors are encouraged to review copy informally as early in the editorial
process as possible. If a manuscript that has already received a sensitivity review is changed, the sensitivity
review editor must review the additions for conformity to the ETS sensitivity guidelines. Editors are also
responsible for reviewing audiovisual publications and artwork proposed for inclusion in publications, using
the same procedures described above. ETS-developed software is also reviewed for sensitivity.

Editorial staff bring sensitivity issues to the attention of the project director. The editor then works with
the project director to eliminate questionable or inappropriate material from the publication.

A project director who chooses not to change a manuscript must reply in writing to the editor's query. In
case of further disagreement, the dispute is resolved with the same arbitration process as that used for test
material.



Review Criteria

The sensitivity review training sessions teach reviewers to evaluate material in light of specific criteria.

(1) Stereotyping

All ETS publications are reviewed to ensure that their language and illustrations reflect a fair and
unbiased attitude toward all people and are free of material that reinforces stereotypes. For example, women
should not be portrayed only cooking, maintaining a home, or taking care of children. Sensitivity reviewers
are trained to identify stereotypes specific to each of the targeted groups and are given a list of "caution
words and phrases." Some of these are unacceptable, e.g., "redmen" when referring to Native Americans.
Most caution words and phrases (e.g., underprivileged) signal that a sensitive issue is being addressed.

(2) Examinee perspective

Test sensitivity reviewers have a particular concern that does not apply often to reviewers of other kinds
of publications. They must evaluate all questions from the perspective of test takers, who do not necessarily
know the correct answers. If an examinee must know the correct answer in order to prevent a question from
reinforcing negative attitudes or stereotypes, the question may be in violation of the guidelines. For example,
a wrong answer to a question about Hispanic culture should not reinforce, for those who mistakenly think
the answer is right, the stereotype of the "lazy" Hispanic who always puts off work until "mañana."

(3) Underlying assumptions

While stereotypes are often blatant, underlying assumptions can be extremely subtle. Underlying
assumptions may lead one to mistake aspects of Western culture for universal norms or to misunderstand a
particular group. For instance, a publication that refers to an "afflicted" person "suffering from" cerebral
palsy reflects the writer's underlying assumptions about what it is like to have this physical condition.

(4) Controversial material

Highly controversial material, such as legalized abortion, is to be included in tests only when it is
relevant to what is being tested. For example, a test for doctors or nurses may have to contain questions on
abortion, but a test of reading ability should not include a reading passage on this controversial subject.







(5) Contextual considerations 

Sometimes the use of potentially sensitive material is unavoidable. There are four main areas in which
this may occur:

• Historical domain. In order to measure an individual's knowledge of history, it may sometimes be
necessary to quote from material written during a period when social values differed markedly from today's.
For example, an older passage describing members of the Black community may use the term "colored."
While it is desirable to avoid such material when possible, the material must be judged in the overall
context in which it appears.

• Literary domain. Material that is designed to measure an individual's knowledge of literature or quotes
from works of literature often contains similar problems. For example, a passage may use the so-called
"generic he" in referring to men and women. Again, such material must be evaluated in light of the
overall purpose of the test.

• Legal domain. Material drawn from legal sources may sometimes deal with sensitive issues. For example,
a law test question on the detention of citizens may refer to the incarceration of Japanese Americans
during World War II.

• Health domain. Certain examinations in the health profession require knowledge that may be considered
sensitive in other contexts. For example, it may be necessary to test nursing candidates' knowledge of
Tay-Sachs disease in Jewish families.

Inclusion of potentially sensitive material depends on the content of the entire test or publication.
Given an appropriate context, use of certain material may be justifiable.

(6) Elitism, Ethnocentricity, and Related Problems

To eliminate concepts, words, phrases, or examples that may upset or otherwise disadvantage a test
taker, ETS makes every effort not to include expressions that might be more familiar to members of a
particular social class or ethnic group than the general population, such as "soul food" and "trust fund,"
unless the terms are defined or knowledge of them is relevant to the purpose of the test. Words and sentence
constructions that could have different meanings for different ethnic or geographic groups are avoided. Care
is also taken to assess the appropriateness of dialect, slang, and non-English words and phrases, such as
"barrio," "stickball," and "maven," which tend to be more familiar to certain ethnic, geographic, or other
subgroups of English speakers.



Additional Information 



The above is an overview of the sensitivity review process. If you have comments, questions, or desire
more information about the process, please write to the Office of Quality Assurance, 09-D, Educational
Testing Service, Princeton, NJ 08541-0001.





ETS STANDARDS
FOR QUALITY
AND FAIRNESS



Adopted by the Board of Trustees



Educational Testing Service • Princeton, New Jersey






PREFACE 



Educational Testing Service (ETS) is strongly committed to the principles of
openness in testing, public accountability, quality, and fairness. In October 1981,
the ETS Board of Trustees adopted and publicly announced as corporate policy
the ETS Standards for Quality and Fairness. At the same time, the Trustees
directed ETS management to maintain a program for monitoring adherence to
the Standards and authorized the appointment of a Visiting Committee of
persons outside ETS who were to annually review and report to the Trustees on
ETS's adherence to the Standards. These actions by the Trustees are tangible
evidence of ETS's commitment as a private, nonprofit educational organization to
public accountability and to publicly declared standards by which the
organization is prepared to be judged. ETS believes that the Standards contribute
significantly to the quality and utility of its programs for those institutions and
individuals ETS serves.

Compliance with these Standards is taken seriously at ETS. The Standards are
applied to all ETS-administered programs. Adherence to the Standards is regularly
assessed through a carefully structured audit process and subsequent
management review. Every three years, the policies and practices of each program are
reviewed by teams of ETS and outside professionals that are asked to report to
senior management any instance in which the program does not meet the intent
of the procedural guidelines. The audit is a rigorous process. Management then
evaluates every recommendation made by the audit teams and decides what
action, if any, should be taken to address the teams' findings. It is only at this
stage, with the full attention of senior management, that considerations of
such factors as cost and technical feasibility are taken into account in judging
how to conform to the Standards.

The ETS Standards and their implementation are important matters to the ETS
Trustees. To ensure that the Standards are interpreted and applied according to
the spirit and purpose intended, the Trustees established a Visiting Committee of
persons outside ETS that is comprised of distinguished educational leaders,
experts in testing, and representatives of organizations that have been critical of
ETS in the past. The Committee meets annually with ETS staff, senior management,
and outside auditors, and it issues a report directly to the Committee on Public
Responsibility of the ETS Board of Trustees in June of each year. The Visiting
Committee report is published by ETS and released in its entirety to the media
and to any interested members of the public.

The ETS Standards and our efforts to apply them reflect ETS's determination to hold
itself accountable to high standards of performance and to setting high
standards for the products and services ETS provides. These efforts have been viewed
positively by ETS staff as well as the clients we serve. We take great pleasure in
noting the first Visiting Committee's conclusion:







    We find ETS's effort to maintain and improve the quality and fairness of
    testing well formulated. We know of no other testing organization with
    anything comparable. The ETS system of auditing its work is an admirable
    component of ETS's commitment to public accountability. We applaud
    ETS's intent to be publicly open about activities in which the public clearly
    has a legitimate interest even though ETS is a private organization.

This publication represents a continuation of this commitment. In 1985, the
American Educational Research Association, American Psychological
Association, and National Council on Measurement in Education adopted a
comprehensive revision of the Standards for Educational and Psychological Testing. The ETS
Standards for Quality and Fairness had been based on the previous joint
standards of these three professional associations. We have revised the ETS
Standards in order to stay in the forefront of measurement and the latest thinking
of the profession. These revised ETS Standards for Quality and Fairness, which
were adopted by the ETS Board of Trustees on April 10, 1986, will be reviewed
carefully during the next year and will be used in the 1986-87 audit process.
Following this trial period, the Standards will be reviewed once again, revised if
necessary, and presented to the ETS Trustees for their final approval in 1987.



Gregory R. Anrig
President






CONTENTS 



Introduction
Accountability
Confidentiality of Data
Product Accuracy and Timeliness
Research and Development
Tests and Measurement: Technical Quality of Tests
Test Development
Test Administration
Test Score Reliability
Scale Definition
Equating
Score Interpretation
Test Validity
Test Use
Public Information
Glossary






INTRODUCTION 






The ETS Standards for Quality and Fairness are designed to ensure that ETS
products and services demonstrably meet explicit criteria in seven areas of basic
importance: Accountability, Confidentiality of Data, Product Accuracy and
Timeliness, Research and Development, Tests and Measurement, Test Use, and
Public Information. The first three sections of the Standards deal with issues that
relate to all ETS activities: the responsibilities of ETS to those affected by its
activities, the rights to and limitations on access to data collected by ETS, and the
control of quality and performance. The remaining sections concern issues
relating to ETS's main endeavors: Research and Development, Tests and
Measurement, Test Use, and Public Information.

The ETS Standards reflect and adopt the Standards for Educational and
Psychological Testing jointly issued by the American Educational Research Association
(AERA), the American Psychological Association (APA), and the National Council
on Measurement in Education (NCME). The ETS Standards, however, are tailored
to ETS's particular circumstances and needs. Thus, the Standards may not be
useful to organizations whose practices, programs, or services differ from those of
ETS.

The Standards are comprised of both principles that underlie ETS efforts in each
area and policies that govern decision-making and guide the development of
more specific goals. The Standards are implemented by ETS management through
procedural guidelines that provide more detailed criteria for ETS's diverse
programs and services. The Standards are reviewed and revised from time to time to
keep abreast of developments in professional practice and research.

Like the Standards for Educational and Psychological Testing issued by AERA,
APA, and NCME, proper interpretation and implementation of ETS's Standards
depends on the seasoned judgment of professional staff. These judgments must
be carefully based on research, professional experience, and sound reasoning.
The ETS Standards are intended to guide and assist ETS professionals in the flexible
and sensitive exercise of professional judgments, not to obviate the need for
those judgments. Thus, if adherence to any procedural guideline is infeasible or
inappropriate in particular circumstances, or if good professional practice in a
particular instance conflicts with the letter of a guideline, then sound practice,
consistent with the spirit of the underlying principles and policies, should prevail.

ETS does not have sole responsibility or authority to determine how or whether
these Standards will be implemented in activities for which practice or policy is
substantially established by a group, individual, or institution other than ETS.
These Standards are not intended to establish obligations on the part of ETS to act
or intervene in situations where the pertinent responsibility rests primarily
outside ETS. However, ETS does encourage and assist groups and institutions in
implementing the Standards related to any of their activities that involve ETS
products or services.

ETS has committed itself to these Standards and to a continuing program of
research and development. As a result, ETS expects to expand the realm of
knowledge relevant to its activities and to nurture at ETS and elsewhere the
development of thoughtful and sensitive professionals with the skills and
sensitivity necessary to apply the principles and the policies embodied in these
Standards.




ACCOUNTABILITY 



1 



Principle

ETS acknowledges its responsibility for the effective stewardship of its resources to
the New York Board of Regents, which has issued its corporate charter; to the
governing boards that sponsor and set policy for programs or services in which
ETS products or services are used; to the councils and committees that advise
with respect to appropriate policy for its programs; to the institutions and
agencies that use its products and services; to persons who take ETS tests (and
parents or guardians of minor persons), submit data for use by ETS or for
distribution to others, or participate in research and development projects
conducted by ETS; and to the professional associations that are concerned with
educational and psychological measurement and research.



Policies

A ETS will furnish appropriate information to those to whom it is responsible so they
may make informed, independent judgments as to the effectiveness with which
ETS exercises its stewardship.

B ETS will seek, consider, and, as appropriate, act on the views of those who
sponsor, use, or are affected by ETS programs and services.

C ETS will seek advice on its activities and policies from qualified men and women
who are not employed or retained on a regular basis by ETS and who are drawn
from appropriate professional disciplines, major philosophies and points of view,
different geographic regions, and the major subgroups within the relevant
population.

D ETS will support the activities of professional associations with respect to
developing and implementing professional standards or codes, making available the
results of current work, and fostering peer review of its activities.

Procedural Guidelines

1 Communicate with sponsors by providing information regularly, by reporting
program status in a manner consistent with contractual requirements, and by
meeting at least annually so that sponsors can

• evaluate ETS services in terms of quality, timeliness, and costs,

• transmit comments or concerns on which ETS will take prompt and
appropriate actions, and

• express opinions about their program and ETS services directly to senior ETS
management.



2 Make available technical and other information about products and services so
sponsors, agencies, institutions, or potential users may evaluate and comment
on them. Include representative materials relevant to intended test use(s). Meet
requests for additional information not included in publications within a
reasonable time and, if necessary, for a reasonable fee, so long as disclosure is
consistent with legal, ETS, and sponsor policy and contractual requirements.

3 Provide information to persons who take ETS tests, submit data for use by ETS, or
participate in ETS research and development projects so they will know

• the sponsor's identity and responsibility,

• the nature of the activity or project,

• the probable use of the product, service, or research, and

• the address to which comments, questions, or criticisms can be submitted.

4 Direct to legal counsel significant proposed new or substantially revised activities
for review for compliance with federal statutes, regulations, case law, or state
law, as appropriate.

5 Seek advice on program policies and plans, where appropriate, from qualified
persons of diverse backgrounds, interests, and experience (e.g., professional
disciplines, philosophies, geographic regions, major subgroups, relevant
populations of interest) who are not regularly employed by ETS. Inform these individuals
about the results of their work within a reasonable period of time.

6 Review publications and other materials to eliminate language or material
generally regarded as sexist, racist, or otherwise offensive or inappropriate.

7 Record, process, and report financial information accurately and in accordance
with generally accepted accounting principles.

8 Monitor changes in federal statutes, regulations, and case law to assure that ETS
activities and operations are in compliance. Compliance with other statutes,
regulations, or case law will be evaluated as appropriate.

9 Provide reasonable accommodations with respect to professional responsibilities
to permit staff members to attend professional meetings, to contribute to the
development of professional standards or codes, to engage in activities of
professional interest, and to stay abreast of current developments in related
fields.

10 Publish an annual report that provides information about organizational
activities and financ




CONFIDENTIALITY OF DATA 



Principle 



ETS recognizes the right of individuals and institutions to privacy with regard to
information supplied by and about them that may be stored in data or research
files held by ETS and the concomitant responsibility to safeguard information in
its files from unauthorized disclosure.



Policies

A ETS will ask individuals to provide information about themselves only if it is
potentially useful to those individuals, is necessary to facilitate processing of
data, or serves the public interest in improving understanding of human
performance. Insofar as possible, individuals should be informed of the purpose for
which the information is requested.

B The right of individuals to privacy regarding information about them that may be
stored in the data or research files held by ETS extends both to processed
information, such as scores based on test-item responses, and the raw data on
which the processed information is based.

C ETS will protect the confidentiality of data supplied by institutions or agencies
about themselves, and so identified, to the extent that such confidentiality does
not conflict with ETS's obligations to individuals.

D ETS will not collect or maintain in its data or research files any critical information
that in its judgment cannot be protected adequately from improper disclosure.

E ETS will encourage the organizations with which it works to adopt policies and
procedures that adequately protect the confidentiality of the data transferred by
ETS to those organizations.

Procedural Guidelines

1 Inform individuals or institutions, to the extent appropriate, before information is
collected, of the information's intended use, the conditions surrounding its
confidentiality and release, and the length of time the information will be
retained.

2 Use identifiable information about an individual or institution only for purposes
for which permission has been granted unless additional consent is obtained.
Release identifiable information from ETS only with proper consent or prior
agreement, or in a manner that assures the confidentiality of the individual or
institution.



3 Make provision for individuals, on presentation of adequate identification (e.g.,
signature and data file number), to authorize the disclosure of information about
themselves from program data files to any appropriate recipient, provided that
disclosure does not violate other ETS or sponsor policies or the privacy of other
individuals. If authorization is from a third party by prior agreement with the
individual, the individual should be notified when disclosure has taken place.

4 Make provision for individuals or their legal representatives to obtain information
about themselves from data files held at ETS. Such release of information must be
consistent with sponsor's policies and be allowed only upon the individual's
submission of appropriate identifying information and, if necessary, payment of
a reasonable fee.

5 Assure that access to electronic, paper, or other forms of confidential data is
reasonably safeguarded, especially when such data may be part of a
time-sharing network, data bank, or other storage medium involving units outside ETS.

6 Develop clear retention guidelines and procedures for eliminating information
from data files in accordance with ETS or sponsor policies or contractual
requirements whenever information on individuals is maintained.

7 Provide identifiable data only in a manner consistent with these guidelines unless
served with a subpoena or other legal process to provide identifiable
information. In that event, inform legal counsel in order to make appropriate efforts to
narrow the subpoena or to obtain a court order or other arrangements to
minimize the dissemination of that information.

8 Inform every organization with which ETS works of the confidentiality of data
transferred by ETS to that organization or collected by it on behalf of ETS so that
the organization can protect the confidentiality of such data.




PRODUCT ACCURACY AND 
TIMELINESS 

Principle 

The accuracy of ETS's principal products and the timeliness with which they are
made available are important functions of the responsibility ETS has undertaken with
respect to its sponsors and the diverse publics it serves.

Policies 

A ETS will establish standards of accuracy and timeliness for each principal product.

B ETS will use quality controls that are adequate to assure that its standards of
accuracy and timeliness are met.

C ETS will make realistic delivery commitments and reasonable efforts to meet
those commitments.

D ETS will sacrifice the timeliness of the delivery of information if the desired
accuracy of that information is substantially in question.

E ETS will seek to inform those adversely affected if, subsequent to its release,
information has been found not to meet ETS standards of accuracy.

F ETS will seek to inform those adversely affected if there is a probability that there
will be substantial departure from ETS standards of timeliness with respect to a
principal product.

Procedural Guidelines 

1 Verify and document that all principal products conform to specifications or
standards before release by doing as many of the following as appropriate:

• independently recomputing or visually inspecting an appropriate sample of
each product, or

• assessing the reasonableness of computed information through reviews by
technically competent staff, or

• reviewing and proofing printed material, or

• assuring adherence to ETS or professional standards through effective peer
review.



2 Verify and document the accuracy of intermediate products when

• the information (e.g., answer keys, conversion parameters, algorithms) is
critical to the principal product, or

• early detection and correction of errors would facilitate meeting delivery
schedules of the principal products.

3 Monitor the accuracy, timeliness, and responsiveness of replies to inquiries
through periodic audits and other means.

4 Report to a specified ETS staff member all instances in which a product failed to
conform to requirements or to standards of accuracy or timeliness. Resolve
discrepant conditions before release of the product unless the cognizant ETS
officer has approved release to benefit the majority of product users.

5 Correct any critical information found to be in error after its release and promptly
distribute corrected information to those adversely affected by the error.

6 Make provision for individuals to verify scores or other information within a
reasonable time. Such requests must be accompanied by appropriate identifying
information and, if necessary, a reasonable fee.

7 Establish schedules or other process control methods to assure the timely
production of each product or service. If it is likely that a product will be late, take
steps (e.g., proper notice to test users) to minimize adverse effect.
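
As an illustration only, and not part of the Standards themselves, the first
option under guideline 1 (independently recomputing an appropriate sample of
each product) might be sketched as below. The record fields, the number-right
scoring rule, and the function names are all assumptions made for the example;
an actual program would substitute its own scoring logic and sampling plan.

```python
import random


def score_answers(responses, answer_key):
    """Number-right score: count responses that match the key (assumed rule)."""
    return sum(1 for given, correct in zip(responses, answer_key) if given == correct)


def audit_sample(records, answer_key, sample_size, seed=0):
    """Independently recompute scores for a random sample of records.

    Each record is a dict with hypothetical fields 'id', 'responses', and
    'reported_score'. Returns the ids whose recomputed score differs from
    the reported score, sorted for stable output.
    """
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    mismatches = [
        r["id"]
        for r in sample
        if score_answers(r["responses"], answer_key) != r["reported_score"]
    ]
    return sorted(mismatches)


answer_key = ["A", "C", "B", "D"]
records = [
    {"id": 1, "responses": ["A", "C", "B", "D"], "reported_score": 4},
    {"id": 2, "responses": ["A", "B", "B", "D"], "reported_score": 3},
    {"id": 3, "responses": ["D", "C", "A", "D"], "reported_score": 3},  # misreported
]

# Sampling all three records here; a real audit would sample a fraction.
discrepancies = audit_sample(records, answer_key, sample_size=3)
print(discrepancies)
```

Any ids returned would then be handled under guideline 4, that is, reported
to a designated staff member and resolved before release.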




RESEARCH AND DEVELOPMENT 



Principle 

A continuing program of research and development conducted in compliance
with professional standards with respect to quality and ethical procedures is
necessary to maintain the high quality and social utility of ETS contributions to
education and society. This includes basic inquiry to increase understanding of
educational processes and human development; public policy, evaluative, and
applied research in response to the needs of the educational community, the
work place, and society at large; and research and development to improve ETS
products and services. Publication of the results of significant research is of
benefit to ETS and the profession because it permits others to use, build upon, or
improve ETS work.



Policies 

A ETS will devote appropriate research efforts to the following:

• Improving measurement and education through the discovery and conceptual
integration of new principles and understanding. This research will be
aimed at extending knowledge of measurement principles and practices,
knowledge of the learner and learning processes, of learning environments
and educational treatments, of educational institutions, and of the interacting
factors that influence human development.

• Improving the technical quality and the utility of ETS products and services.
Among the important issues addressed by this research will be problems of
test development, reliability and generalizability, equating, validity, and the
soundness of test interpretation.

• Responding to the measurement and educational needs of society and creating,
improving, and evaluating instruments, systems, and programs of service
that meet these needs.

• Special problems faced by subgroups in society involved with test taking. In
addition, ETS will encourage analysis by subgroup whenever subgroup interests
are pertinent to the research being undertaken.

B ETS will conduct its research under appropriate review procedures that protect
the rights of privacy and confidentiality of human subjects or respondents and
of cooperating institutions.

C ETS will follow procedures to insure that ETS research is of high quality.






D ETS researchers will adhere to appropriate professional and ethical standards,
including those published in Ethical Principles in the Conduct of Research with
Human Participants, AERA Guidelines for Eliminating Race and Sex Bias in
Educational Research and Evaluation, and Ethical Standards of Psychologists.

E ETS will encourage the dissemination of full accounts of research in the usual
professional forums and will provide internal means by which the results of ETS
research can be disseminated.



Procedural Guidelines 

1 Assure the welfare and the right to confidentiality of human subjects or
respondents in each project following procedures approved by the Committee on
Prior Review of Research. Procedures approved by the Committee include
obtaining appropriate informed consent, separating participants' names from
data and other steps relating to confidentiality, and avoiding any negative
consequences of participation.

2 Report the results of research with appropriate care to participants and
institutions so that the possibility of misinterpretation and misuse are minimized.

3 Publish or otherwise disseminate the results of research projects unless a
justifiable need to restrict dissemination is identified before the research begins.

4 Follow review procedures for research proposals and reports that will assure that
research is of high quality. Reviews may include the following considerations:

• the rationale for the research,

• the soundness of the design,

• the thoroughness and care of the data collection and analysis,

• the reasonableness of the interpretation,

• the clarity of the exposition, and

• the soundness of the project planning and management.

5 Provide for a periodic assessment of research and development priorities to
assure an adequate balance of resources directed toward

• improving knowledge of measurement, occupations, educational processes,
and human development,

• meeting the needs of the educational community and society, including
subgroups of special interest,

• improving ETS products and services and the manner in which these products
and services are used, and

• developing new methodologies (including educational, psychometric and
statistical, and technological procedures).






6 Whenever sex, ethnic, racial, or other population groups are pertinent to the
research, studies should be designed to allow analyses by subgroup.

7 Provide non-ETS researchers with reasonable access to ETS-controlled
nonproprietary data so long as the privacy of individuals and organizations and ETS's
contractual obligations can be protected. Grant access to data facilitating the
reanalysis and critique of published ETS research with the same requirements of
confidentiality of individuals and institutions. Encourage other organizations to
adopt a similar policy.
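
One of the steps named in guideline 1 of this section is separating
participants' names from data. The sketch below shows one way such a
separation could work and is offered as an illustration only; the field names,
the use of random study identifiers, and the function name are assumptions for
the example and are not drawn from the Standards.

```python
import secrets


def separate_identities(records, id_fields=("name",)):
    """Split records into an identity table and a de-identified data table.

    Returns (identities, data_rows). 'identities' maps a random study id to
    the identifying fields and would be stored separately under restricted
    access; 'data_rows' carries only the study id and the research data.
    """
    identities = {}
    data_rows = []
    for record in records:
        study_id = secrets.token_hex(8)
        while study_id in identities:  # regenerate on the (unlikely) collision
            study_id = secrets.token_hex(8)
        identities[study_id] = {field: record[field] for field in id_fields}
        row = {k: v for k, v in record.items() if k not in id_fields}
        row["study_id"] = study_id
        data_rows.append(row)
    return identities, data_rows


# Hypothetical participant records.
records = [
    {"name": "Participant One", "score": 52},
    {"name": "Participant Two", "score": 47},
]
identities, data_rows = separate_identities(records)
```

Under this arrangement the data table alone could be shared with outside
researchers (guideline 7), while re-linking a row to a person requires the
separately held identity table.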






TESTS AND MEASUREMENT:
TECHNICAL QUALITY OF TESTS

This section, which deals with ETS testing activities, is structured into seven
subsections that are devoted to test development, test administration, reliability,
scale definition, equating, score interpretation, and validity.

Principle 



Policies

A ETS will develop tests in which the knowledge, skills, abilities, or personal
characteristics measured, procedures followed, and criteria used will be
appropriate to the use for which the test is designed and that will be unbiased with
regard to relevant major population subgroups being tested.

B ETS will establish standards for test administration processes that minimize
variations in test performance due to circumstances or conditions not relevant to the
attributes being measured.

C ETS will establish for its tests a high degree of reliability consistent with the
requirements and the purposes of the test.

D ETS will develop scales for reporting scores in a rational fashion consistent with the
requirements and the intended uses of the test.

E ETS will provide equating systems, when appropriate, for the perpetuation of
score scales with the highest level of precision practicable.

F ETS will make available to score recipients data for interpreting scores on ETS tests
that foster appropriate use of those scores.

G Recognizing that test validation is a responsibility of both test users and test
developers, ETS will encourage and assist test users in their validation efforts and
will make available tests that are designed to meet professionally acceptable
standards of validity for the primary purposes of each test.

H ETS will adhere to appropriate professional standards such as those published in
Standards for Educational and Psychological Testing and Principles for the
Validation and Use of Personnel Selection Procedures.




Procedural Guidelines: Test Development

1 Obtain substantive contributions to the test development process from qualified
men and women, including persons who are not on the ETS staff and who
represent diverse institutions, population subgroups, perspectives, and
professional specialties. Document their relevant qualifications and characteristics.

2 Ascertain and document appropriate background information for each test to
be developed, including

• the test's intended use(s),

• the population that will take the test, including anticipated major subgroups,

• the procedures followed for defining the domain to be assessed, a description
of the domain, and a description of its relevance to anticipated test use(s).

3 Document information relative to the test being developed, including

• the rationale for the item type(s) and test format to be used and whether any
background or prior experience factors (e.g., age or cultural background of
intended test takers) affected item type or test format selection,

• the procedures followed for generating test content to represent the domain
or to link test and job content,

• the rationale for the scoring method(s), especially when judgmental processes
are used,

• the item response model, calibration procedures, and the nature of the sample
used to estimate parameters when item response theory procedures are used
to assemble the test,

• the rationale and procedures for making branching decisions, for terminating
the test, and for scoring the test when adaptive or branching tests are used,
and

• the logical or empirical arguments supporting comparability when multiple
methods for presenting items or recording responses (e.g., recording answers
in test books, on answer sheets, or with electronic devices) are intended to be
used, and interpretative guidelines for multiple methods where comparability
is not supported.

4 Prepare, with appropriate advice and review, test development specifications for
each test that cover the following:

• Content and Skills: a clear description of what is to be tested, including,
where appropriate, critical content to be included in each form and the
relative weight to be given to each part of the domain that is to be measured.

• Test and Item Format: item types to be used, special requirements regarding
directions and sample items or tests.

• Psychometric: the intended level of difficulty of the test, the number of
items, requirements regarding the target distribution of item difficulties (when
using pretested items), requirements regarding the homogeneity of items
within each test or subtest and the correlation between subtests or tests,
requirements for equating including the content and statistical specifications
for equating items, and the testing time allotted or suggested.

• Sensitivity: requirements for the inclusion of material reflecting the cultural
background and contributions of major population subgroups, and

• Scoring: the procedures for scoring, especially when judgmental processes
are used.

5 Assure that time requirements are consistent with the test's purpose so that time
is not a decisive factor in performance for the large majority of test takers, except
for tests designed to measure rate of performance.

6 Have subject matter and test development specialists who are familiar with the
specifications and purpose of the test and with its intended population review
the test items for accuracy, content appropriateness, suitability of language,
difficulty, and the adequacy with which the domain is sampled.

7 Review individual items, the test as a whole, directions, and descriptive materials
to assure that

• appropriate technical standards, such as those contained in ETS item writers'
manuals, are met,

• language, symbols, words, phrases, and content that are generally regarded as
sexist, racist, negative toward major subgroups, or otherwise potentially
offensive are eliminated, except when judged to be necessary for adequate
representation of the domain,

• editorial standards for clarity, accuracy, and consistency are met,

• clear and complete directions appropriate to the nature of the test and the
characteristics of the test takers are provided,

• typographic format (e.g., test books, screens, tapes) and test-book layout
facilitate the task of test takers, and

• sufficient sample questions are contained in program publications to be
representative of test content, item types, and difficulty.

8 Evaluate the performance of individual items by pretesting, pilot testing,
reviewing the results of administering similar items to a similar population, or
conducting preliminary analysis before scores are reported.

9 Whenever there are sufficient subgroup members to permit meaningful analysis,
study item performance relative to subgroups when consideration of the
recommended use(s) of the test and the characteristics of the intended test-taking
population, in light of prior research, indicates the need for such studies.






10 Evaluate the performance of each test edition by

• carrying out timely and appropriate item and test analyses, including analyses
for reliability, intercorrelation of sections or parts, and speededness,

• reviewing the adequacy of fit of item response models to data when item
response theory procedures are used to develop, score, or equate the test, and

• comparing the test's characteristics to its psychometric specifications.

11 Review test content and test specifications periodically to assure their
continuing relevance and appropriateness to the domain being tested.

12 Review test editions developed in prior years and their descriptions in
publications to assure the continued appropriateness of both content and language for
the present test-taking population and the subject matter domain.

13 Analyze major changes in test specifications to assure that they are followed by
appropriate consideration of the implications for score comparability and to
determine whether test name changes or other cautions to test users about
comparisons with earlier tests are necessary.
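
Guidelines 8 and 10 call for item and test analyses without naming particular
statistics. As a hedged illustration, the sketch below computes two
conventional classical indices for each item: difficulty (proportion
answering correctly) and discrimination (the correlation between the item
score and the total score on the remaining items). The choice of these
statistics, the 0/1 data layout, and the function name are assumptions made
for the example; they are not prescribed by the Standards.

```python
from statistics import mean, pstdev


def item_analysis(scored):
    """Classical item statistics for a matrix of 0/1 item scores.

    'scored' is a list of examinee rows, each a list of 0/1 item scores.
    Returns one dict per item with its difficulty and discrimination.
    """
    n_items = len(scored[0])
    totals = [sum(row) for row in scored]
    results = []
    for j in range(n_items):
        item = [row[j] for row in scored]
        rest = [t - x for t, x in zip(totals, item)]  # total excluding this item
        difficulty = mean(item)  # proportion answering correctly
        sd_item, sd_rest = pstdev(item), pstdev(rest)
        if sd_item == 0 or sd_rest == 0:
            discrimination = None  # undefined when either score has no variation
        else:
            products = [x * r for x, r in zip(item, rest)]
            covariance = mean(products) - mean(item) * mean(rest)
            discrimination = covariance / (sd_item * sd_rest)
        results.append(
            {"item": j + 1, "difficulty": difficulty, "discrimination": discrimination}
        )
    return results


# Hypothetical responses: five examinees, four items.
scored = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
results = item_analysis(scored)
for row in results:
    print(row)
```

The same function could be run separately on each major subgroup's responses,
where group sizes permit, as a first step toward the subgroup study described
in guideline 9.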






Procedural Guidelines: Test Administration 

1 Provide prospective examinees (or, in some programs, parents or guardians as
well) with information in advance of the test administration about the following,
as appropriate:

• the test's intended purpose and what it is designed to measure, typical test
items, clear directions for the test and the response method to be used, a
description of how scores are derived including formation of composite
scores, strategies for taking the test (e.g., guessing and pacing), whether the
test contains items not intended to be scored, and the background and
experience relevant to test performance,

• the program procedures and requirements including test dates, test fees, test
center locations, special testing arrangements for handicapped persons or
others, test registration, score reporting, score cancellation by examinees, ETS,
or the sponsor, and registering complaints, and

• test administration procedures and requirements including those related to
identification and admission to the test center, materials permitted in or
excluded from the testing room, and the consequences of misconduct.

2 Establish test centers that are convenient, nondiscriminating, comfortable, and
accessible to all individuals, including handicapped persons. Locate test centers
in both minority and majority communities to foster accessibility.

3 Advise test center staff of the need to minimize distractions and to make
examinees comfortable in the testing situation. Instruct staff to be sensitive to the
psychological as well as physical needs of examinees. Direct supervisors to
consult with or include on the test center staff, when appropriate, subgroup
members and persons knowledgeable about handicapping conditions.

4 Provide test center staff with a description of the program, the expected
candidate population, the duties of staff, and the procedures for

• receiving, storing, and distributing test materials to examinees, and returning
them to ETS,

• admitting examinees to the test center, including ID requirements,

• administering the test to examinees, including handicapped individuals,

• using appropriate seating plans and assignments and monitoring the testing
room to reduce opportunities to obtain scores by questionable means,

• handling of suspected cheating, misconduct, or emergencies, and

• reporting irregularities (e.g., disturbances, mistimings, defective test questions
or materials, power failures, or misconduct) so that, after review, appropriate
action can be taken.






5 Provide test center staff with directions (to be read aloud before the test begins)
that cover the recording of answers on answer sheets or via other devices, timing
of test sections and breaks, guessing strategies, and the consequences of using
unauthorized aids or engaging in other forms of misconduct.

6 Utilize effective and equitable procedures for preventing, identifying, and
resolving scores obtained by questionable means.

7 Encourage examinees to report any irregularities so that, after review,
appropriate action can be taken.

8 Undertake quality control activities (e.g., test center observations, solicitation of
suggestions from test administrators and examinees, training of test
administrators) as necessary for effective and, when appropriate, secure test
administrations.

9 Make tests available to handicapped individuals through special testing
arrangements or special test editions, as appropriate.

10 Provide users of locally administered tests with instructions about standardized
conditions for administering and scoring the tests.






Procedural Guidelines: Test Score Reliability 



1 Provide information to enable test users to judge whether reported test scores, 
including subscores and combinations of scores, are sufficiently reliable for their 
intended use(s). 

2 Document sources of variation (eg, test form, content, population of readers, 
time interval between testing, and other sources of error) over which inferences 
are intended to be made from reported test scores 

3 Estimate the reliability or consistency of reported test scores by method(s) that
are appropriate to the nature and intended use of the test scores and that take
into account sources of variance considered significant for test score interpretation.

4 Document the method(s) used to assess the reliability or consistency of the test
scores and the rationale for using them, the major sources of variance accounted
for in the reliability analysis, and the formula(s) used and/or appropriate references.

5 Document the results of the reliability analysis, including

• a reliability coefficient, an overall standard error of measurement, classification
consistency, or other equivalent information about the consistency of the
test scores;

• standard errors of measurement or other measures of score consistency for
score regions within which decisions about individuals are made on the basis
of test scores;

• the degree of agreement between independent scorings when judgmental
processes are used;

• the adjusted and unadjusted coefficients if reliability estimates are adjusted
for restrictions of range; and

• correlations between short forms of tests, if developed, and the standard form.

6 Document the conditions under which the reliability estimates were obtained,
including

• the nature of the population involved;

• the selection procedures for and the appropriateness of the analysis sample,
including the number of observations, means, and standard deviations for the
analysis sample(s) and any group(s) for which reliability is estimated;

• the basis for scoring when scores are based on judgments, including selecting
and training scorers, and the procedures for allocating papers to scorers and
adjudicating discrepancies;

• the time intervals between testings, the rationale for the time intervals, and the
order in which the forms were administered if alternate-form or test-retest
methods are used;







• speededness data, and

• correlations of reported subscores within the same test or test battery 

7 Whenever there are sufficient subgroup members to permit meaningful analysis,
study the reliability or consistency of reported scores for major subgroups when
consideration of the intended use(s) of the test and the characteristics of the
intended test-taking population, in light of prior research, indicates the need for
such studies.






Procedural Guidelines: Scale Definition 

1 Establish scales for reporting scores that are well-constructed throughout their
range and in a way that facilitates meaningful score interpretation relative to
intended use(s) of the scores.

2 Establish scale values to be reported that do not encourage finer distinctions
among test takers than can be supported by the precision of the test.

3 Choose the scale values in a manner that avoids confusion with other scales that
are widely used by the same population of score recipients.

4 Document the rationale and the methods used to determine score scales 
Account for the following as appropriate 

• If scores derived from different tests in a program are to be directly compared,
take into account in the scaling methods the differences among groups taking
the different tests.

• If the scale is to be normative, consider the probable length of time and the
extent to which the normative information will be appropriate and useful for
the intended population.

• If a test or test battery yields multiple scores for an individual and comparisons
among scores are encouraged, establish scales in a manner that allows
meaningful comparisons among scores (e.g., normatively or against an absolute
standard), or provide data to allow such comparisons.

• If the scale is to be defined with reference to performance standards, classification,
or cut scores, document the method and rationale used, and the qualifications
of any judges.

• If a scale is used to report composite scores derived from weighting subscores,
clearly state the rationale and the method for weighting the subscores.

5 Avoid reporting raw scores or percentages of questions answered correctly on a
test or subtest except under one or more of the following circumstances:

• only one edition of the test is to be offered; 

• scores on one edition will not be compared with scores on another, 

• raw scores on all editions are comparable, or

• raw scores are reported in a context that supports the intended interpreta- 
tion(s) 

6 Report item responses for individuals or groups only in a context that supports 
the intended appropriate interpretation(s) 

7 Redefine an established scale only under compelling circumstances. Provide
announcements to all score recipients indicating the change and cautioning
recipients against comparisons with earlier scores. If the numerical values are to
be changed, change them substantially to minimize confusion between the old
and the new scale.






Procedural Guidelines: Equating 

1 Assure comparability of scores that are derived from different editions of the
same test and are used to compare individuals or groups.

2 Document methods used to achieve comparability, including

• the rationale for selecting the methods used;

• the consistency between the assumptions underlying the method and the
circumstances under which the method is applied (e.g., when test editions are
equated using common items, make the directions, context, speededness,
item placement, and other aspects of the test as nearly the same as possible for
all examinees; when anchor scores are based on a test that is not representative
of the tests being equated, make sure the groups of examinees used for
equating are equivalent; or when item response models are used, make sure
that information is presented on the adequacy of fit of the model to the data);

• the procedure for linking adequately all editions of the test for which scores
should be comparable; and

• the plans for specially designed studies to collect data to achieve comparability
if only a limited number of editions are offered to institutional or other users
who will administer and score the tests.

3 Document the results of the equating experiment, including

• the nature of the population involved;

• a description of the analysis sample(s), including the number of observations,
means, and standard deviations;

• the time intervals between testings; and

• other statistics appropriate to the method used (e.g., correlation between the
anchor test, if used, and the total test).

4 Periodically assess the results of methods used to achieve comparability of
scores and evaluate the stability of the score scale.






Procedural Guidelines: Score Interpretation 

1 Provide score interpretation information for all score recipients in terms that
facilitate appropriate interpretations. Provide information that is appropriate for
each category of score recipient (e.g., examinee, teacher, college, agency, or
media) and that minimizes the possibility of misinterpretation of individual
scores as well as group results.

2 Provide each category of score recipients with appropriate information that

• concerns the intended use(s) of the test and what it is designed to measure,

• recommends only those score interpretations for which supporting information
is available,

• describes scale properties that affect score interpretation and use,

• explains the variability of and limitations on the accuracy of test scores (e.g.,
standard error of measurement, classification errors), and encourages recipients
to take such information into account in making decisions based on
scores;

• supports assessments based on individual items or clusters of items whenever 
such uses are suggested, and 

• gives the minimum score(s) required to pass the test when results are reported
as pass/fail and examinees have failed the test.

3 Provide score recipients with an appropriate frame of reference for evaluating
the performance represented by test scores through information based on
norms studies, carefully selected and defined program statistics, or logical
analysis. When statistical information is included, the information should be
adequately labeled and the nature of the group(s) on which the information was
based should be clearly identified.

4 Document the method(s) (e.g., norms studies, derivation of program statistics,
cut-score studies) used to develop score interpretation information. Provide the
following types of information, as appropriate:

• the characteristics of the scale and procedures used to maintain it; 

• the method of selecting participants on which data are based, including 
information about representation of relevant major subgroups within the 
defined population, 

• the participation rate of categories of individuals or institutions and their
characteristics such as the age, sex, or subgroup composition of the group,
weighting systems or other adjustments made to form the norming sample,
and whether or not the participants were self-selected;

• the period in which the data were collected;

• appropriate group statistics whenever tests are intended to be used to make
assessments of such groups (e.g., classrooms) rather than individuals;






• methods and rationale for aggregating test results or developing composite
scores,

• estimates of sampling error and possible effects of nonparticipation,

• comparisons with relevant data on variables from other sources when possible,
and

• evidence supporting the cut scores or configural scoring rules when different
score interpretations are automatically provided for examinees scoring at
different points on the scale.

5 Revise norms or other score interpretation information at sufficiently frequent
intervals to assure its continued appropriateness as a frame of reference for
evaluation of performance represented by test scores.

6 Compile descriptive statistics periodically from samples or from the entire
population to monitor the participation and performance of major subgroups.

7 Provide score recipients with information as appropriate to assist them in using
scores in conjunction with other information, setting cut scores, interpreting
scores for major subgroups, conducting local norms studies, and developing
local interpretive materials.

8 Avoid developing interpretive information for subgroups unless sufficient data
are available on each subgroup to make the information meaningful, the information
can be accompanied with a carefully described rationale (e.g., guidance
purposes) for using it, and the information can be presented in a way that
discourages incorrect interpretation and use.

9 Caution score recipients, when appropriate, that

• scores for different tests offered by a program may not be comparable even 
though the scores are reported on similar scales, 

• inferences that have not been adequately validated (e.g., ones based on
foreign language translations, untimed tests for handicapped persons, experimental
tests) should be made with care,

• scores may no longer be comparable if test content or specifications have
changed sufficiently;

• scores earned in previous years may become of limited value due to changes
in the individual or the meaning of test scores over time, and

• decisions based on the differences between test scores for an individual (e.g.,
aptitude and achievement) should take into account the overlap between the
constructs and the reliability of the score difference.







Procedural Guidelines: Test Validity 

1 Provide evidence relating to the intended use(s) of the test scores. Assure that
tests are validated by procedures that are most appropriate to the intended
use(s) of the test scores.

• Content-related evidence generally is based on a description of how the test
and test items were derived from and are related to the areas of interest.

• Criterion-related evidence generally is based on statistical relationships
between test scores and as many distinct performance variables as necessary
to evaluate the test score's effectiveness.

• Construct-related evidence generally is based on the logical and empirical
analysis of processes underlying performance on the test, the relationship
between test scores and other pertinent variables.

2 Describe how the validity evidence provided is appropriate to the intended use(s)
of the test.

3 Document the validation procedures used and the results of the analyses
performed. Address the following points, as appropriate:

• the number and qualifications of any experts who made judgments, and
procedures used to arrive at judgments pertinent to the validation effort;

• the materials surveyed, and the rationale and procedures for defining test
content;

• for tests designed to sample job functions, the link between job tasks and test
content and, when specified, the link between job tasks and the knowledge,
skills, and abilities being tested;

• the rationale and procedures for determining criterion relevance, the selection
procedures for and the composition of the validation sample, the relationship
between predictors and criteria, and factors that affect the relationship,
including technical quality of the criteria (e.g., their reliability, the elapsed time
between test administration and criterion data collection, and rules for
combining criteria if several criteria are combined), and

• when quantitative evidence is reported, information relative to its interpretation
such as associated standard errors of the estimate, adequacy of the
sample, possible restriction of range of scores on the variables, unadjusted
coefficients (when statistical adjustments are made), the need for cross validation,
and other contextual factors.

4 Base validity evidence in a particular situation (e.g., institution, department, or
job study) on data from other situations only when it can be established that the
particular situation is from the same population of situations. Include in documentation
information about the similarity of the groups tested, the curricula,
the job tasks, or other appropriate criterion variables.






5 Undertake new validity studies whenever the test, mode of administration, the
characteristics of the intended test-taking population, or the performance
domain sampled is changed substantially.

6 Whenever there are sufficient subgroup members to permit meaningful analyses,
investigate validity for major subgroups when consideration of the intended
use(s) of the test scores and the characteristics of the intended test-taking
population in light of prior research indicates the need for such investigation.

7 Establish test names that imply no more than the validity evidence justifies.

8 Provide information to users to help them plan, conduct, and interpret validity
studies.




TEST USE 



Principle 

Proper and fair use of ETS tests is essential to the social utility and professional
acceptance of ETS work.

Policies

A ETS will set forth clearly to all score recipients the principles of proper use of tests
and interpretation of test results.

B ETS will establish procedures by which fair and appropriate test use can be
promoted and misuse can be discouraged or eliminated.

Procedural Guidelines 

1 Provide score recipients (e.g., examinees, teachers, colleges, agencies, or the
media) with adequate descriptions of intended test use(s), caution them about
making interpretations not supported by validity evidence, and warn them
against reasonably anticipated misuses.

2 Encourage test users to put test scores in an appropriate perspective (e.g.,
augment test scores with other relevant information about the examinee,
provide multiple opportunities to retest or to demonstrate relevant skills by other
means).

3 Provide users with opportunities for consultation about test use and with infor- 
mation about reliability, validity, test content, test difficulty, and representative 
research 

4 Advise users that when using test scores differently for members of different
subgroups (e.g., separate sex norms or using racial data in regression equations),
such uses should be carefully and rationally supported.

5 Advise users that whenever individuals are assigned to groups on the basis of test
scores, users should undertake periodic examinations of

• pass-fail or cut-score policies,

• the rationale and methods for making assignments,

• the performance of individuals within their respective groups, where feasible,
including the collection of empirical evidence to support the assignments,

• the continued appropriateness of assignment criteria, and

• classification rates across major subgroups.






6 Investigate complaints or allegations of improper score use. When a misuse is
verified, advise the sponsor and the user and seek voluntary correction. If efforts
to achieve voluntary correction are not successful, consult with the sponsor to
determine whether to continue services to the misuser. Maintain records of
complaints and their disposition.

7 Assure the accuracy of any ETS-produced promotional material concerning tests
and their intended uses.







PUBLIC INFORMATION 

Principle 

ETS is dedicated to promoting public understanding of testing, measurement, and
related educational issues by providing programs of public information,
research and advisory and instructional activities.

Policies

A ETS will promote understanding of the purposes and procedures of testing and the
proper uses of test information among examinees, test users, and the general
public. ETS will encourage sponsors to undertake similar efforts.

B ETS will adhere to high professional and ethical standards in both the promotion
and the use of its products and services and in the dissemination of information
to examinees, test users, and the general public. ETS will encourage sponsors and
other organizations to do so.

C ETS will provide instruction and technical assistance in testing, measurement,
evaluation, and related areas.

D ETS will disseminate the results of research on testing, measurement, and other
related educational issues and will make ETS-controlled nonproprietary data
available to other researchers. Further, ETS will encourage other organizations to
do the same.

E ETS will respond promptly and appropriately to requests for advice and technical
assistance related to programs and services offered by ETS, to purposes and
procedures for testing, to uses and misuses of test information, and to complaints
about its services.

F ETS will collect reference materials relating to tests, measurement, evaluation,
and related research, and will make its collections available to professional
groups, organizations, and interested individuals.



Procedural Guidelines 

1 Develop and disseminate publications and other materials to promote proper
test use, discourage misuse, and improve public understanding of testing, measurement,
and related educational issues directly and in collaboration with
sponsors.

2 Convene periodically groups of test users, measurement specialists, representatives
of professional groups, and other interested parties to examine ETS procedures
and recommend improvements in them.






3 Provide accurate and appropriate information when marketing ETS products and
services.

4 Provide advice and technical assistance on tests and measurement for test
sponsors, users, and other interested groups.

5 Offer conferences, seminars, workshops, and other forms of training or instruc- 
tion in testing, measurement, and other relevant areas of interest, acting inde- 
pendently or in cooperation with other institutions or professional groups. 




GLOSSARY OF TERMS 



Absolute Standard: A cutscore or performance standard that is established
without reference to the score distribution of the people for whom the standard
will be operational. For example, a passing score set at 80 percent of the
questions correct without basing the decision on how many people will score
above or below that point is an absolute standard. See Cutscore, Performance
Standard. Compare Relative Standard.

Accuracy: The extent to which a principal product conforms to its specifications.

Achievement Test: A test that measures a particular body of knowledge or set of
skills and that is ordinarily used to assess a person's level of performance after the
person has participated in some learning experience, the outcome of which the
test is intended to measure. Compare Aptitude Test.

Adaptive Test: A test administered such that the next item to be administered to
a person depends on the person's response to a previous item or set of items.

Adjusted Coefficient. A statistic that has been revised to estimate its value under 
conditions other than those in the sample on which it has been calculated. For 
example, a correlation coefficient may be adjusted to account for restriction of 
range. See Restriction of Range 

Alternate Form: An edition of a test that is written to meet the same specifications
and is comparable in most respects to another edition of the test except
that some or all of the questions are different. An alternate form may or may not
be a parallel form. Compare Parallel Form. See Test Specifications.

Alternate Form Reliability: An estimate of reliability based on the correlation
between alternate forms of a test administered to the same group of people. See
Alternate Form, Reliability. Compare Internal Consistency Reliability, Test-Retest
Reliability.
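
As an illustration of the alternate form reliability entry, the sketch below computes a Pearson correlation between scores that the same group earned on two forms. The function name and the score lists are hypothetical, and the choice of the Pearson coefficient is an assumption; the glossary says only "correlation".

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of the same five people on two alternate forms.
form_a = [52, 61, 47, 70, 58]
form_b = [50, 63, 45, 72, 55]
alternate_form_reliability = pearson_r(form_a, form_b)
```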

Analysis Sample. The group of people on whose performance a statistic or set of 
statistics has been calculated 

Anchor Test: A usually relatively short test administered with two or more forms
of a test for the purpose of equating those forms. See Common Items, Equating.

Answer Key: A listing of the correct responses to a set of test questions.

Aptitude Test: A test that is usually not closely related to a specific curriculum
and which is used primarily to predict future performance. Compare Achievement
Test. Note that the distinction between aptitude tests and achievement
tests is not strong and depends more on differences in test use than on differences
in test content.






Attributes: Qualities or characteristics of a person, such as command of a body
of knowledge, ability to perform certain skills, or interest in performing a particular
type of task.

Branching Test: See Adaptive Test.

Classification Error: (1) The proportion of inconsistent categorizations of examinees
that would be made on repeated administrations of the same test or of a test
and an alternate form, assuming no changes in the examinees' true performance
levels. (2) The assignment of an examinee to the wrong category, such as passing
a person who lacks minimal competence and should fail.
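
Sense (1) of classification error can be sketched as a simple proportion. The example below is a minimal illustration with hypothetical names and scores; it assumes a pass-fail decision in which scores at or above the cutscore pass, as in the Cutscore entry.

```python
def classification_inconsistency(scores_first, scores_second, cutscore):
    """Proportion of examinees categorized differently on two administrations."""
    if len(scores_first) != len(scores_second) or not scores_first:
        raise ValueError("need two non-empty, equal-length score lists")
    inconsistent = sum(
        (a >= cutscore) != (b >= cutscore)  # pass on one, fail on the other
        for a, b in zip(scores_first, scores_second)
    )
    return inconsistent / len(scores_first)

# Hypothetical scores for four examinees on a test and an alternate form.
rate = classification_inconsistency([58, 62, 70, 59], [61, 63, 55, 57], cutscore=60)
```

Here two of the four examinees change category, so the rate is 0.5.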

Classification Rates. The proportions of examinees placed in various categories, 
such as pass-fail, on the basis of test scores 

Client (See Sponsor) 

Committee on Prior Review: An ETS institutional review board that reviews
proposed and ongoing research to ensure adequate protection of human subjects.

Common Items: A set of test questions that remain the same in two or more
forms of a test for purposes of equating. The common items may be dispersed
among the items in the forms to be equated or kept together as an anchor test.
Compare Anchor Test. See Equating.

Comparable Scores: Scores that are put on the same scale so that they have the
same meaning in terms of relative ranking within a defined group of people but
that cannot necessarily be used interchangeably. For example, percentile rank
scores on a reading test and on a math test are comparable scores if the
percentile ranks have been based on the same norm group for both tests.
Compare Equivalent Scores.

Composite Score A score that is the combination of two or more scores by some 
specified formula 

Configural Rule. A specified procedure for interpreting the pattern of a person's 
scores on two or more tests or subtests 

Consent Permission granted by an individual or that individual's parent or 
guardian for the use or release of data held by ETS, such permission granted upon 
receipt of a reasonable explanation of the purpose of the use or release and a 
reasonable explanation of the manner in which the results will be reported 

Construct A theoretical concept developed to explain a group of related 
behaviors. Examples of constructs are "intelligence," "creativity," "self concept," 
"anxiety " 

Conversion Parameters Quantitative rules for expressing scores on one test form 
in terms of scores on an alternate form See Alternate Form, Equating. 

Criterion: (1) That which is predicted by a test, such as college grade-point
average or job-performance rating. (2) The score with which responses to a test
item are correlated.






Criterion Relevance: The extent to which the measure used in assessing a test's
predictive validity is related to the test's intended purpose.

Critical Content: Knowledges, skills, or abilities that must be measured in a test
because of their importance.

Critical Information: Information that will be used to draw important inferences
(a) about the sponsor, ETS-appointed external committees, institutional or agency
user, examinee, subject or respondent, or (b) by the sponsor, institutional or
agency user, examinee, subject or respondent and which, if incorrect, could be
harmful.

Cross Validation: The application of scoring weights or prediction equations
derived from one sample to a different sample to allow estimation of the extent
to which chance factors determined the weights or equations or inflated the
validity estimated in the analysis sample.

Cutscore: A point on a score scale at or above which examinees are classified in
one way and below which they are classified in a different way. For example, if a
cutscore is set at 60, then people who score 60 and above may be classified as
"passing" and people who score 59 and below classified as "failing."

Domain: A defined body of knowledge, skills, abilities, attitudes, interests or other
characteristics.

Equating: A statistical process used to convert scores on two or more alternate
forms of a test to a common scale such that the scores may be used interchangeably.
See Anchor Test, Common Items, Conversion Parameters.

Equivalent Scores: Test scores that can be used interchangeably. Compare
Comparable Scores.

ETS Board of Trustees: The ETS Board of Trustees is the governing body of ETS. There
are 17 trustees. Sixteen are elected for four-year terms. New members of the
Board are elected by current trustees. The President of ETS is an ex officio member.

ETS-Held Program Data Files: Information about individuals and institutions held
by ETS and derived from ETS-provided services of collection, processing, storage,
retrieval and dissemination.

ETS-Held Research Files: Information held by ETS and generated through ETS-conducted
research.

Examinee: An individual who takes a test developed and/or administered by ETS.

Formula Score Raw score on a multiple choice test after a correction for guessing 
has been applied, usually the number right minus a fraction of the number 
wrong. See Raw Score 
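
The formula score entry can be illustrated with a short sketch. The glossary says only that "a fraction of the number wrong" is subtracted; the fraction 1/(k - 1) for items with k answer options used below is a common convention and is an assumption here, as are the function and parameter names. Omitted items are not counted.

```python
def formula_score(num_right, num_wrong, num_options):
    """Number right minus a fraction of the number wrong.

    Assumes the common correction fraction 1 / (num_options - 1).
    """
    if num_options < 2:
        raise ValueError("a multiple choice item needs at least 2 options")
    return num_right - num_wrong / (num_options - 1)

# Hypothetical examinee: 40 right, 10 wrong, five-option items.
score = formula_score(40, 10, 5)
```

With these hypothetical values the correction removes 10/4 = 2.5 points, giving 37.5.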

Handicapping Conditions. (1) A visual, auditory, other physical or learning dis- 
ability such that a test administered under standardized conditions would result 
in a score that significantly underestimates the person's true ability. (2) A 
disability which limits a person's access to a testing site See Standardized 
Conditions 






Institutional or Agency User: An organizational recipient of ETS-processed or
produced information.

Intermediate Product: Materials that are not released externally, but that are
necessary to the production of the principal product.

Internal Consistency Reliability: An estimate of reliability based on the extent to
which the items in a test tend to measure the same attribute in the same way. See
Reliability. Compare Alternate Form Reliability, Test-Retest Reliability.

Item: A test question.

Item Analysis: A statistical description of how an item performed within a
particular test when administered to a particular sample of people. Data often
provided are the difficulty of the question, the number of people choosing each
of the options, and the correlation of the item with some criterion.
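
The three statistics named in the item analysis entry can be sketched as follows. This is a minimal illustration, not a description of any ETS procedure: difficulty is taken to be the proportion answering correctly, the item-criterion correlation is a Pearson correlation between the 0/1 item score and the criterion, and all names and data are hypothetical.

```python
from collections import Counter
from math import sqrt

def item_analysis(responses, key, criterion):
    """Return (difficulty, option counts, item-criterion correlation)."""
    n = len(responses)
    correct = [1 if r == key else 0 for r in responses]
    difficulty = sum(correct) / n          # proportion answering correctly
    option_counts = Counter(responses)     # people choosing each option
    mean_c = sum(correct) / n
    mean_k = sum(criterion) / n
    cov = sum((c - mean_c) * (k - mean_k) for c, k in zip(correct, criterion))
    var_c = sum((c - mean_c) ** 2 for c in correct)
    var_k = sum((k - mean_k) ** 2 for k in criterion)
    # The correlation is undefined if everyone got the same item or criterion score.
    r = cov / sqrt(var_c * var_k) if var_c and var_k else None
    return difficulty, option_counts, r

# Hypothetical data: four examinees, keyed answer "A", total score as criterion.
difficulty, counts, r = item_analysis(["A", "B", "A", "C"], "A", [30, 10, 28, 12])
```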

Item Response: (1) A person's answer to a question. (2) The answer to a question
coded into categories such as right, wrong, or omit.

Item Response Theory: A set of propositions relating people's performance on
test questions to certain characteristics of the people and certain characteristics
of the items by means of mathematical models. It is based on the assumption
that the probability of a correct response by a person to an item can be
calculated from the examinee's estimated ability and certain statistical characteristics
of the item.
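
The entry does not name a particular model. As one illustration, the sketch below uses a logistic item response function with difficulty, discrimination, and guessing parameters, which is a widely used form; the choice of model, the function name, and the parameter names are assumptions.

```python
from math import exp

def prob_correct(ability, difficulty, discrimination=1.0, guessing=0.0):
    """Probability of a correct response under a logistic item response model."""
    logistic = 1.0 / (1.0 + exp(-discrimination * (ability - difficulty)))
    return guessing + (1.0 - guessing) * logistic

# When ability equals item difficulty and there is no guessing, the
# probability is one half; a guessing parameter raises the lower bound.
p_equal = prob_correct(ability=0.0, difficulty=0.0)
p_guess = prob_correct(ability=0.0, difficulty=0.0, guessing=0.2)
```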

Item Type: The observable format of a test question. At a very general level "item
type" may refer, for example, to multiple choice or free response questions. At a
finer level of distinction, "item type" may refer, for example, to synonym
questions or antonym questions.

Local Norms: A distribution of scores and related statistics within an institution or
closely related group of institutions (such as the schools in one district) used to 
give additional meaning to test scores by serving as a basis for comparison. 

Locally Administered Test A test that is given by an institution at a time of the 
institution's own choosing 

Normative Scale: A way of expressing a score's relative standing in the distribution
of scores of some specified group.

Parallel Forms: Alternate forms of a test that yield nearly identical means and
standard deviations of scores as well as nearly identical correlations between
scores and other variables. See Alternate Form.

Parameter: (1) The value of some variable for a population as opposed to an
estimate of the value based on a sample drawn from the population. (2) In item
response theory, one of the characteristics of an item such as its difficulty.

Part Score: A score derived from a subset of the items in a test. Synonym of
Subscore.

Performance Standard: A cutscore or a defined level of performance on some
task. For example, "Run 100 yards in 12 seconds or less." See Cutscore.






Pilot Testing: Small scale try-out of test questions or a test form often involving
observation of and interviews with examinees.

Precision: The width of the interval within which a value can be estimated to lie
with a given probability. The higher the precision, the smaller the interval
required to include the value at any given probability.

Principal Product: ETS-produced or processed materials (e.g., annual reports,
performance data, score reports and admissions tickets) that are released or
transmitted to a sponsor, ETS-appointed external committee, institutional or
agency user, examinee, subject or respondent, pursuant to a contract or published
commitment.

Principles For The Validation And Use Of Personnel Selection Procedures, Division
of Industrial-Organizational Psychology, American Psychological Association,
Berkeley, CA: The Industrial-Organizational Psychologist, 1980.

Program Statistics: Data that are based on the groups of people that happen to
take the tests offered by a particular testing program. Program statistics are not
equivalent to data derived from carefully selected samples of defined populations
such as those used to construct national norms.

Raw Score: (1) The number of items answered correctly on a test with no 
adjustment (2) In some usages, the formula score is also called a raw score See 
Formula Score 

Regression Equation A formula used to estimate the value of a variable given the 
value of one or more observed variables. For example, estimating college grade 
point average given high school grade point average and SAT scores 

Relative Standard A cutscore or performance standard that is established with 
reference to the score distribution of the people for whom the standard will be 
operational. For example, a cutscore set to pass 60 percent of the people is a 
relative standard See Cutscore. Compare Absolute Standard. 

Reliability: An indicator of the extent to which test scores will be consistent 
across different conditions of administration and/or administration of alternate 
forms of the test See Alternate Form Reliability, Test-Retest Reliability 

Respondent' An individual who provides data to a research project in a manner 
and for a purpose different from either examinees or subjects. 

Response Methodlhe procedure used by an examinee to indicate an answer to 
a question such as a mark on an answer sheet, a handwritten essay, or an entry 
in an electronic storage medium 

Restriction of Range: A case in which the variance of scores in an analysis sample
is lower than the variance of scores in the population from which the sample was
selected. See Analysis Sample, Variance.
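
A small Python sketch of the comparison this definition describes. The scores and the selection rule (keeping only scores of 600 or above) are invented for illustration.

```python
from statistics import pvariance


def is_range_restricted(sample_scores, population_scores):
    """True when the analysis sample varies less than the population
    from which it was selected."""
    return pvariance(sample_scores) < pvariance(population_scores)


population = [200, 300, 400, 500, 600, 700, 800]
# Hypothetical selection: only examinees scoring 600 or above are retained.
selected = [s for s in population if s >= 600]
print(is_range_restricted(selected, population))
```

Selecting on the score itself, as here, is the typical way restriction of range arises: the retained group is more alike than the group it came from.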

Sampling Error: The difference between a statistic derived from a particular
sample and its value in the population from which the sample was drawn. See
Parameter (1).






Score: A quantitative or categorical value (such as "pass" or "fail") assigned to an
examinee as the result of some measurement procedure.

Score Recipient: A person or institution obtaining the scores of individual
examinees or summary data for groups of examinees.

Score Scale: The set of numbers within which scores are reported for a particular
test or testing program, often, but not necessarily, having a specified mean and
standard deviation for some defined reference group.

Special Testing Arrangement: A test administered under non-standardized conditions
in which modifications have been made to meet the needs of examinees
who require the modifications for appropriate assessment, such as providing
audio-taped versions of tests for visually impaired people. See Standardized
Conditions.

Speededness: The extent to which people's scores are affected by how quickly
they respond to items on a test. One indicator of speededness is the percent of
test takers who answer all of the items in the test.
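
The indicator named above can be sketched in Python. The data layout (one list of responses per test taker, with `None` marking an unanswered item) is an assumption made for this example.

```python
def percent_completing(responses):
    """Percent of test takers who answered every item.

    responses: one list per test taker; None marks an item left unanswered.
    """
    finished = sum(1 for r in responses if all(a is not None for a in r))
    return 100.0 * finished / len(responses)


responses = [
    ["A", "C", "B"],     # answered everything
    ["B", None, None],   # ran out of time after the first item
    ["D", "C", "A"],     # answered everything
    ["A", "B", None],    # missed the last item
]
print(percent_completing(responses))
```

A low completion percentage suggests that time limits, and not only the ability being measured, are influencing scores.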

Sponsor: Educational, professional or occupational associations, federal, state or
local agencies, public or private foundations which contract with ETS for its
services. This category includes their governing boards, membership and
appointed committees or staff.

Standard Deviation: A statistic characterizing the magnitude of the differences
among a set of measurements. Specifically, it is the square root of the average
squared difference between each measurement and the mean of the measurements.
See Variance. The standard deviation is the square root of the variance.
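
A short Python sketch that follows this wording literally. It divides by the number of measurements (the "average" squared difference), not by one less than that number as is done when estimating from a sample. The scores are invented.

```python
import math


def variance(measurements):
    """Average squared difference between each measurement and the mean."""
    mean = sum(measurements) / len(measurements)
    return sum((x - mean) ** 2 for x in measurements) / len(measurements)


def standard_deviation(measurements):
    """Square root of the variance."""
    return math.sqrt(variance(measurements))


scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores), standard_deviation(scores))  # 4.0 2.0
```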

Standard Error of Estimate: A statistic that indicates the standard deviation of
differences between actual and estimated measures. It is an indicator of the
accuracy of the estimate. See Standard Deviation.

Standard Error of Measurement: A statistic that indicates the standard deviation
of the differences between observed scores and their corresponding true scores.
It has also been described as the standard deviation of scores for a person taking
a large number of parallel forms of a test, assuming no changes in the person's
true ability. See True Score, Standard Deviation.
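
The glossary gives no formula. Under classical test theory the standard error of measurement is commonly computed as the standard deviation of scores times the square root of one minus the reliability; the Python sketch below assumes that relationship, and its numbers are invented.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory estimate: SD * sqrt(1 - reliability).

    This formula is an assumption brought in for illustration;
    it is not stated in the glossary entry.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


# A scale with a standard deviation of 100 and a reliability of 0.91
# gives a standard error of measurement of about 30 scale points.
print(round(standard_error_of_measurement(100, 0.91), 1))
```

Note the limiting case that matches the True Score entry: a perfectly reliable test (reliability of 1) has a standard error of measurement of zero.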

Standardized Conditions: The administration of a test in the same manner to all
examinees to allow fair comparison of their scores. Factors such as timing,
directions, and use of aids such as calculators and dictionaries are controlled to be
constant for all examinees.

Standards for Educational and Psychological Testing, American Psychological
Association (APA), American Educational Research Association (AERA), and
National Council on Measurement in Education (NCME). Washington, D.C.: APA,
1985.

Subgroup: A part of the larger population which is definable according to various
criteria as appropriate (e.g., by sex, race or ethnic origin, training or formal
preparation, geographic location, income level, handicap and/or age).






Subject: An individual who participates in an ETS laboratory or experimental
research project.

Subscore: A score derived from a subset of the items in a test. Synonymous with
part score.

Subtest: A subset of the items in a test upon which a subscore or part score is
based.

Test Analysis: A description of the statistical characteristics of a test following
administration, including but not limited to distributions of item difficulty and
discrimination indices, score distributions, mean and standard deviation of
scores, reliability, standard error of measurement, and indices of speededness.

Test Battery: (1) A collection of measures designed to allow the comparison of
scores across measures for an individual. (2) Loosely speaking, a collection of
tests often administered together.

Test Form: A unique edition of a test consisting of all of the identical copies of a
test. Compare Alternate Form, Parallel Form.

Test Format: The physical layout of a test including the spacing of items on a
page, type size, positioning of item response options, etc.

Testing Program: A set of arrangements under which examinees are scheduled to
take a test under standardized conditions, the tests are supplied with instructions
for giving and taking them, and arrangements are made for scoring the tests,
reporting the scores, and providing interpretive information as part of a comprehensive
ongoing service. A program is characterized by its continuing character
and by the inclusiveness of the services provided.

Test-Retest Reliability: An estimate of reliability based on the correlation
between scores on two administrations of the same test to the same group of
people. See Reliability. Compare Alternate-Form Reliability.
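
The entry does not say which correlation is meant; the Python sketch below assumes the Pearson product-moment correlation, which is the usual choice. The two score lists are invented and stand for the same five people tested twice.

```python
import math


def pearson_correlation(x, y):
    """Pearson product-moment correlation between paired score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


first_administration = [48, 52, 55, 61, 70]
second_administration = [50, 51, 57, 60, 72]
print(round(pearson_correlation(first_administration, second_administration), 2))
```

A value near 1 means people kept nearly the same relative standing across the two administrations.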

Test Specifications: Detailed documentation of the intended characteristics of a
test including but not limited to the content and skills to be measured, the
number and type of items, the level of difficulty, the timing, and the layout.

Test-Taking Population (Intended): The people for whom a test has been designed
to be most appropriate. The actual test-taking population may differ in some
instances from the intended population.

Timeliness: The degree to which a principal product is released or delivered to its
recipient within a predefined schedule.

True Score: The hypothetical average score of an examinee calculated from an
infinite number of administrations of equivalent test forms assuming no learning,
forgetting, or fatigue on the part of the examinee. It is the score that an examinee
would obtain if the test were perfectly reliable and the standard error of
measurement were zero. See Reliability, Standard Error of Measurement.

Validity: The extent to which inferences made on the basis of test scores are
appropriate and justified by evidence.







Variance: A statistic characterizing the magnitude of the differences among a set
of measurements. Specifically, it is the average squared difference between each
measurement and the mean of the measurements.

Weighting System: (1) A formula giving the relative contribution (expressed as a
multiplier) of part scores to a composite score. See Composite Score. (2) The
relative contribution assigned to certain sample data to better represent a target
population.
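
Sense (1) can be sketched in Python as a weighted sum. The part names, scores, and multipliers are hypothetical and serve only to show the arithmetic.

```python
def composite_score(part_scores, weights):
    """Combine part scores into a composite using one multiplier per part."""
    return sum(weights[name] * score for name, score in part_scores.items())


part_scores = {"verbal": 50, "mathematical": 60}
weights = {"verbal": 2, "mathematical": 1}  # verbal counts twice as much
print(composite_score(part_scores, weights))  # 160
```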






The ETS Sensitivity 
Review Process: 
Guidelines and Procedures 






ACKNOWLEDGMENT 



This document is a revision of the original ETS Test Sensitivity Review Process that was developed in
1980 by Ronald V. Hunter and Carole D. Slaughter.

Substantial contributions have been made by other writers of earlier documents dealing with the issue
of sensitivity. Many of these pioneering efforts, such as the ETS Guidelines for Testing Minorities, the ETS
Guidelines for Sex Fairness in Tests and Testing Programs, and the Guidelines for Avoiding Sexist
Language,* provided much of the creative thought and detail contained within this document.

Finally, many ETS staff members have taken the time to review drafts of this document; in so doing
they have provided a wealth of helpful suggestions and productive insights on this complex issue.



*From McGraw-Hill, Guidelines for Equal Treatment of the Sexes, 1974. Used with the permission of McGraw-Hill
Book Company. Recently reissued in Guidelines for Bias-Free Publishing.



Copyright © 1986 by Educational Testing Service.
All rights reserved.

Educational Testing Service, ETS, and [logo] are registered
trademarks of Educational Testing Service.
Educational Testing Service is an equal opportunity/
affirmative action employer.







Table of Contents 



Introduction
Process Overview
Reviewers
Sensitivity Review for Tests
Sensitivity Review for Other Publications
Procedures for Sensitivity Reviews of Tests
Preliminary Review
Final Review
Arbitration Process
Procedures for Sensitivity Reviews of Publications
Mandatory Review
Arbitration Process for Publications
Evaluation Guidelines
Definitions
Evaluation Requirements
Cognitive/Affective
Controversial Material
Examinee Perspective
Balance
Stereotyping
Caution Words and Phrases
Special Review Criteria for Women's Concerns
Special Review Criteria for References to People with Disabilities
Underlying Assumptions
Context Considerations
Elitism, Ethnocentricity, and Related Problems
Appendix A: Guidelines for Recognition of Unacceptable Stereotypes
Appendix B: Caution Words and Phrases
Appendix C: Special Review Criteria for Women's Concerns
Appendix D: Special Review Criteria for References to People with Disabilities
Appendix E: Sample Forms
Appendix F: Sensitivity-Related Sections of ETS Official Documents






ETS SENSITIVITY REVIEW PROCESS: 
GUIDELINES AND PROCEDURES 



INTRODUCTION 



Educational Testing Service is committed to ensuring that its tests and publications acknowledge the
multicultural and multiethnic nature of our society and reflect a thoughtful and fair consideration of the
very broad character of ETS's clientele. As part of the effort to attain this goal, ETS has stated in its
Standards for Quality and Fairness that individual test questions, tests as a whole, and descriptive materials
must not contain language, symbols, words, phrases, and examples that are generally regarded as sexist,
racist, or otherwise potentially offensive, inappropriate, or negative toward any group.¹

This document is the basic guide to the process through which these standards are met. It identifies the
sensitivity criteria used in the reviews and details all review procedures. Although most of the criteria are
general ones that can and should be applied to any population group, experience has shown that a special
effort must be made to evaluate material from the perspectives of Asian/Pacific Island Americans, Black
Americans, Hispanic Americans, individuals with disabilities, Native Americans/American Indians, and
women. This publication, therefore, specifically addresses areas of special concern to these six groups.



PROCESS OVERVIEW 



Reviewers 

The reviewers for sensitivity evaluations are trained in two-day workshops that cover all issues; in
addition, there are one-day refresher workshops for periodic review of sensitivity issues. Trained staff members
from test development and test editing areas represent the general disciplines of the humanities, the
social sciences, the sciences, and vocational education. Trained editorial staff also serve as sensitivity reviewers
for nontest publications. While women and minority staff members are represented among the reviewers,
any professional volunteer can be trained to perform sensitivity reviews. Before formally reviewing test
material or other ETS publications, all reviewers receive training in the ETS sensitivity guidelines and the
process in order to ensure that they understand the review criteria and are able to apply them consistently.

Sensitivity Review for Tests 

The test sensitivity review process has three major components: an optional preliminary review, a
mandatory final review, and an arbitration process. Every pretest and final form (scored test) must have a
sensitivity review, and every test more than five years old must have one before reprinting.

Preliminary Review (optional): Any staff member in the process of assembling a test may request a preliminary
review to screen questions, reading passages, and other such materials for possible problems and
deficiencies. The reviewer's recommendations are not binding at this stage. However, this review may reveal
problems at a point early in the test development process when modifications can be made more easily.

Final Review: The mandatory final review takes place at the time of the editing process. After editing,
substantive changes are not normally made in a test. This final sensitivity review must be conducted even if
the test received a preliminary review. If possible, the preliminary and final reviews should be performed by
the same person.



¹ See Appendix F.






Arbitration Process: When the assembler of the test and the sensitivity reviewer cannot agree on how
to resolve [text illegible in source], the parties meet with the test development director from the test
assembler's area to discuss a possible resolution. If mediation is unsuccessful at this stage, the material
in question goes to arbitration.

An arbitration panel consists of three staff members not in test development divisions. These arbiters
receive the same training in the ETS sensitivity guidelines as do the reviewers. Arbiters may not serve on a
decision-making panel involving a program for which they work.

[Paragraph illegible in source.]

The decisions of the arbitration panel are final and binding.

Sensitivity Review for Other Publications 

Sensitivity reviews for nontest publications are conducted as part of the normal editorial review process.
Ordinarily, sensitivity issues are resolved between the reviewer and the author. In case of disagreement, the
dispute is resolved through the same arbitration process used for test material.



PROCEDURES FOR SENSITIVITY 
REVIEWS OF TESTS 



Preliminary Review 

During the optional preliminary review, test items, reading passages, and other such materials can be
screened to detect potential problems. The preliminary review is performed at the request of the test assembler.
[Text illegible in source] the test work folder, which contains several documents, including:

1. A copy of the test specifications

2. The test items (usually unassembled)

3. Any other relevant material

4. The test sensitivity review report form

The sensitivity reviewer returns the work folder and report form (see Appendix E) with comments and
recommendations to the test assembler within 48 hours. Time is charged to the project/job for the test.
Although the reviewer's recommendations are not binding, failure to modify the test material may
result in similar recommendations during the final review. [Text illegible in source]
degree during the editing process (final mandatory review) can cause delays in the overall [text illegible in source].

Final Review 

The mandatory final review takes place during the test-editing process.² If the test has received a preliminary
[text illegible in source] form [text illegible in source] to indicate that the mandatory review has been performed.
It is recommended that the preliminary and final reviews be performed by the same person. The test
[text illegible in source]

1. The test assembler fills out the top portion of the front page of the test sensitivity review report
form. It is important to indicate at this time the exact nature of both the final form requirements
and pretest requirements for multicultural material in the test.

² [Text illegible in source] as part of the editing process for some mathematics and science tests.






2. The test assembler submits the entire test work folder to the sensitivity review router in his or her
division for assignment to a sensitivity reviewer.

3. The sensitivity review router logs the work folder and assigns it to a reviewer. The router may give
the test to a reviewer from another division.

4. The sensitivity reviewer evaluates the test in accordance with the Guidelines to determine
conformity.³

5. The sensitivity reviewer completes the test sensitivity review report form, by which sensitivity comments
and recommendations are documented, and returns it to the router along with the work
folder within 48 hours. If no recommendations are made, the sensitivity reviewer indicates acceptance
on the test assembler's control sheet and the test sensitivity review form and returns them to
the router along with the work folder. Time is charged to the project/job for the test.

6. The test assembler discusses the report with the sensitivity reviewer as necessary. If the sensitivity
reviewer has made no recommendations, the assembler signs and dates the report form and files it
in the work folder. The test or test section is then sent through the usual test production cycle. If
the sensitivity reviewer has made recommendations, the test assembler provides a written response
to each issue, outlining planned action and, where appropriate, a rationale, and returns these
responses to the sensitivity reviewer along with the work folder.

7. The sensitivity reviewer indicates concurrence or nonconcurrence with the test assembler's responses
and returns the form and work folder within 48 hours of receipt to the test assembler.

If the sensitivity reviewer is satisfied with all of the responses, the report form is signed and
dated, the control sheet is signed, and all documents are returned to the test assembler along with
the work folder. The time is charged to the project/job for the test.

A sensitivity reviewer who disagrees with the test assembler's planned actions will meet first
with the test assembler and the sensitivity review coordinator from the test assembler's area. If no
resolution occurs, the assembler, the reviewer, and the sensitivity review area coordinator from the
test assembler's area meet with the test development director of the test assembler's area to attempt
to resolve the issue(s).

The test development director serves as a mediator and attempts to resolve the issues to the
mutual satisfaction of both the sensitivity reviewer and the test assembler.⁴ If the problem is resolved
at this time, one of two processes takes place:

a. The sensitivity reviewer indicates concurrence with the test assembler's rationale, and both the
sensitivity reviewer and test assembler sign and date the report form, indicating the test is acceptable
to both sensitivity reviewer and test assembler. The sensitivity reviewer also signs the control
sheet and charges the time to the project/job for the test.

b. The test assembler makes the agreed-upon changes, indicating what revisions have been made,
and forwards the report form and work folder to the sensitivity reviewer. The sensitivity reviewer
signs and dates the report form, indicating the test is acceptable as revised, signs the control
sheet, and returns both to the test assembler along with the work folder within 48 hours of
receipt. Time is charged to the project/job for the test.

In cases where there is no resolution, the sensitivity reviewer will record on the report form all
of his or her disagreements with the test assembler's responses. The report form and the work folder
are submitted to the test sensitivity review coordinator from the test assembler's area. The coordinator
submits the material for binding arbitration. In recording his or her position, the sensitivity
reviewer should make specific references to relevant sections and pages in the Guidelines.

Both the test assembler and the test sensitivity reviewer write memoranda of explanation to the
arbitration panel.

The test sensitivity review area coordinator requests that the test sensitivity review steering
committee chairperson form an arbitration panel.

All materials go through the test sensitivity review area coordinator to the arbitration panel.
The arbitration panel gives its decision to the coordinator, who notifies the involved parties.

8. The Test File Library retains the final test sensitivity review report form and the arbiters' decision as
permanent components of the work folder.

³ The test sensitivity review report form must not be used for comments or suggestions other than sensitivity concerns.
Reviewers are encouraged to make such comments but to write them on a separate sheet of paper.

⁴ At any point in the process, the sensitivity reviewer may consult with his or her test sensitivity area coordinator or
any other available area coordinator.






Arbitration Process 

As soon as it is recognized that arbitration will be required, the test sensitivity review area coordinator
should notify the chairperson of the sensitivity review steering committee. It is important that this be
done quickly so that an arbitration panel, consisting of three of the five trained arbiters, can be assembled
as soon as possible. The chair of the sensitivity review steering committee appoints a chair for the panel
and arranges for a meeting room. The panel's decision is due within one week. Time is charged to the
project/job of the test under consideration.

Since the mandatory sensitivity review occurs during the test editing process, other editorial changes
in test material can be made while the sensitivity item is in arbitration. Copy editors should NOT sign off,
however, until the control sheet is signed by the test assembler's area director, who notes near the appropriate
box: "Decided in Arbitration."



Further procedural steps are as follows:

1. The sensitivity reviewer, test assembler, sensitivity review area coordinator, and test development
director sign the test sensitivity review arbitration control sheet. Signatures indicate awareness that
the passage/item/test is going to arbitration, not necessarily agreement with either party.

2. The work folder and sensitivity review report form are given to the sensitivity review area coordinator.

3. All arbitration occurs through written material only. There will be no oral arguments by either
party before the panel of arbiters.

a. In a memorandum, the sensitivity reviewer must clearly indicate the nature of the problem(s)
and must cite the section(s) and page number(s) of the guideline(s) being violated. A reviewer's
inability to find specific references may indicate that the objection is inappropriate.

b. In a memorandum, the test assembler must clearly indicate the reason(s) for NOT accepting the
sensitivity reviewer's recommendations. (Time constraints will not be considered sufficient reason
for not changing test material.) The test assembler's statements should explain how and
why the test material does NOT violate the guidelines. The test assembler's statements should
be documented, including text references to the Guidelines and copies of test specifications when
appropriate.

c. Other written material may be solicited by the panel itself.

The arbiters are familiar with the ETS test sensitivity guidelines and have a copy of the procedures
available when they meet. The panel can decide one of the following:

1. Passage/item/test is in violation of the guidelines and the material must be changed or dropped.

2. Passage/item/test is not in violation of the guidelines.

3. The guidelines do not address and are not relevant to the problem raised by the reviewer.

In reviewing the passage/item/test, arbiters may discover that another passage/item, not cited by the
reviewer, violates a guideline. It is the duty of the arbitration panel to rule on that material as well. The
arbitration process is intended to provide a mechanism for resolving disagreements between test assembler
and sensitivity reviewer; however, as the fundamental goal of the sensitivity review process is to eliminate
offensive material, arbiters would be remiss if they were not to rule on any material brought before them
that violates the guidelines.

Once the panel has made its binding decision, the arbitration control sheet, the test sensitivity review
report form, and the test assembler's control sheet are signed and returned to the area sensitivity review
coordinator. Copies of the arbitration decision and the memoranda written by the test assembler and the
sensitivity reviewer are sent by the assembler's area sensitivity review coordinator to the assembler's area
director, the steering committee, the test assembler, and the sensitivity reviewer. The sensitivity review area
coordinator and the steering committee are also sent copies of the passage or item. The sensitivity review
coordinator ensures that any necessary changes are implemented.






PROCEDURES FOR SENSITIVITY REVIEWS 
OF PUBLICATIONS 



Mandatory Review 

Sensitivity reviews of all ETS publications other than tests are performed by the editorial staff. Editors,
like test reviewers, must go through sensitivity training to be qualified to perform such reviews. If an editor
of a publication is also the author of the manuscript, another editor performs the sensitivity review. Editors
undertake sensitivity reviews when the manuscript has reached final draft stage, before it is put into production.
However, editors are encouraged to review copy informally as early in the editorial process as possible.
If copy is changed or added to a manuscript already reviewed for sensitivity and in production, the editor
must review the additions for conformity to the ETS sensitivity guidelines. Editors also review publications
produced before the most recent guidelines were issued when such publications are scheduled to be
reprinted.

Editors are also responsible for reviewing audiovisual publications and artwork proposed for inclusion
in publications, using the same procedures described above.

Editorial staff bring sensitivity issues in publications to the attention of the project director. The editor
then works with the project director to eliminate questionable or inappropriate copy from the publication.
A project director who chooses not to change the copy, due to conflicts with program policies, must reply
on the publications sensitivity review form to the editor's objections. If the disagreement continues, the
sensitivity reviewer, the project director, and the publications sensitivity review coordinator meet with the
division director of the project director's area. The division director serves as a mediator and attempts to
resolve the issue(s) to the mutual satisfaction of the sensitivity reviewer and the project director. If the
problem is not solved, the publications sensitivity review coordinator notifies the chair of the steering
committee, and the dispute goes to arbitration as quickly as possible.

Arbitration Process for Publications 

The chair of the sensitivity review steering committee arranges for an arbitration panel, appoints a
chair for the panel, and arranges for a meeting room. At this point, the sensitivity reviewer, project director,
publications sensitivity review coordinator, and sensitivity review coordinator from the project director's
area sign the sensitivity review arbitration control sheet. Signatures indicate awareness that the publication
is going to arbitration, not necessarily agreement with either party.⁵

In a memorandum, the sensitivity reviewer must clearly indicate the nature of the problem(s) and must
cite the section(s) and page number(s) of the guideline(s) being violated. A reviewer's inability to find
specific references may indicate that the objection is inappropriate.

The project director must clearly indicate the reason(s) for NOT accepting the sensitivity reviewer's
recommendations. Time constraints will not be considered sufficient reason for not changing material. The
project director's statements should explain how and why the material does NOT violate the guidelines. The
project director's statements should be documented, including appropriate references to the Guidelines and
Procedures and copies of relevant specifications when appropriate.

The draft publication, sensitivity review report form, and any explanatory memoranda from the sensitivity
reviewer and the project director are given to the publications sensitivity review coordinator, who
forwards them to the chair of the arbitration panel. The panel may solicit other written material itself.

An arbitration panel will be convened and a decision rendered, usually within one week of notification.
Three of the five arbiters will be asked to serve on a panel. Time charges are to be made to the project/job
of the publication under consideration.

The arbiters have received ETS sensitivity training and have a copy of the procedures available at the
meeting. The panel can decide one of the following:

1. Material is in violation of the guidelines and must be changed or dropped.

2. Material is not in violation of the guidelines.

3. The guidelines do not address and are not relevant to the problem raised by the reviewer.



⁵ All arbitration occurs through written materials only. There are no oral arguments by either party before the panel
of arbiters.






In reviewing the material, arbiters may discover that additional areas not cited by the reviewer violate a
guideline. It is the duty of the arbitration panel to rule on that material also. The arbitration process is
intended to provide a mechanism for the resolution of disagreements between project director and sensitivity
reviewer. However, as the major goal of the sensitivity review process is the elimination of offensive material,
arbiters would be remiss if they were not to point out and rule on material brought before them that is
in violation of any part of the guidelines.

Once the arbitration panel has made its binding decision, its members sign the arbitration control sheet
and the publications sensitivity review form and return them to the publications sensitivity review coordinator.
Copies of the decision, together with the material under arbitration and the memoranda written by the
project director and the sensitivity reviewer, are sent by the publications coordinator to the project director's
division head, the steering committee, the project director, and the sensitivity reviewer. The sensitivity review
coordinator will ensure that the changes, where necessary, are implemented.



EVALUATION GUIDELINES



The success of the sensitivity review process depends upon the consistent implementation of clear and
established policies. It is necessary that reviewers and editors be familiar with all of the guidelines
discussed below to ensure that all people and groups are treated fairly in tests and publications and that all
test programs and clients are asked to comply with the same standards.



Definitions

Group Reference Questions reflect the multicultural nature of our society and
are of two basic types: representational and substantive.

Representational items

These items test knowledge or skills that are independent of the particular
subject matter presented in the stimulus material or in the item itself. Such
items are generally found in tests measuring listening skills, reading comprehension,
problem-solving in mathematics, writing ability, interpretation of
data, and the like. For example, if the purpose of the item is to test whether a
candidate knows how to read a bar graph, what the bar graph itself indicates
is irrelevant; the same skill can be measured whether the graph compares the
number of cars manufactured by different companies, the number of people
who are in the various income tax brackets, or the number of Hispanic men
and women who have earned doctorates each year during the past decade.
Usually, items in this category can be changed without great difficulty to
include references to women or minority groups.

Examples

1. Skill: Identification of an error in grammar

Original sentence:

Henry Fielding is widely known and highly praised for his novels, few
people realize that he established the first police force in England.
No error

Revised item that includes a representational women's reference:

Gwendolyn Brooks is widely known and highly praised for her poetry,
few people realize that she has also published a novel. No error

[In both versions, underlined portions of the sentence are labeled A through D
and "No error" is labeled E; the positions of the underlines are not recoverable
from the source.]



2. Skill: Ability to read a simple chart

Original chart from which question was drawn:

Number of Test Development Specialists in State X's Employ

[Chart not reproduced: vertical axis from 0 to 15; horizontal axis 1965, 1970, 1975]

New chart that includes a representational group reference:

Number of Hispanic Americans Holding Professional Jobs in
State X's Government

[Chart not reproduced: vertical axis 500, 1000, 1500; horizontal axis 1965, 1970, 1975]

3. Skill: Spatial orientation
Original item:

Jim rowed 1 kilometer east and then 1 kilometer south. In what direction
would Jim have to row in order to return directly to his starting point?
(A) North (B) Northeast (C) Northwest
(D) Southeast (E) Southwest

Revised item that includes a representational Asian-American reference:
Mr. Chyn rowed 1 kilometer east and then 1 kilometer south. In what
direction would he have to row in order to return directly to his starting
point?

(A) North (B) Northeast (C) Northwest
(D) Southeast (E) Southwest

4. Skill: Reasoning
Original item:

If a man who had visited the United States in the 1830s wrote, "People in
America were unusually friendly," you would probably give the most
credence to his judgment about American people if you also found that

(A) Americans of the time condemned the idea that America was a
happy-go-lucky culture

(B) ministers in the 1830s insisted that puritanism was declining

(C) other travelers in the 1830s who came from the same culture as the
author had come to the same conclusion

(D) other travelers in the 1830s who came from many different cultures
had come to the same conclusion as the author

(E) the first American social club was founded in the 1830s






Revised item that includes a representational women's reference:
If a woman who had visited the United States in the 1830s wrote,
"Unmarried women in America were unusually emancipated," you would
probably give the most credence to her judgment about these women if
you also found that

(A) social psychologists in the 1980s contend that women in the United
States are more emancipated than women in most societies

(B) United States writers of novels in the 1830s described some women
characters who refused to follow established rules of conduct

(C) in the 1830s, another traveler, who came from the same culture as the
author, had come to the same conclusion

(D) in the 1830s, men and women travelers, who came from many
different cultures, had come to the same conclusion as the author

(E) the first suffragist newspaper in the United States was founded in the
1830s

Substantive items

Substantive items test particular kinds of knowledge. These items are
usually found in tests meant to measure knowledge gained in a particular
course of study in a particular discipline. Substantive items related to the
concerns of minority groups and women are included in the test according to
the requirements of the test specifications, which, of course, are intended to
reflect what is being taught in the discipline. Some of these items may cover
subjects that can be expected to arouse negative emotional reactions in
certain subgroups of the population and thus would not be appropriate
subjects to cover in representational items. For example, a test in American
history would probably deal with slavery; a test for nurses might include
items about sickle-cell anemia or Tay-Sachs disease. All such items should be
reviewed for sensitivity concerns in light of the purpose of the test, the
population taking it, and the curriculum it is designed to test.




Evaluation Requirements 

All questions, including group reference questions, and where applicable entire test sections or tests, are
evaluated from a number of perspectives.

Cognitive/Affective

These two dimensions apply to all group reference questions. The cognitive dimension deals with the
factual basis of questions, i.e., whether the information in the question is accurate. The affective dimension
reflects the positive or negative feelings the question may evoke from various segments of the testing
population. There are four possible combinations of these two factors, illustrated by the following chart:

                      COGNITIVE
                 Factual     Erroneous
AFFECTIVE
  + positive        a            b
  - negative        d            c

Category "a"

ACCEPTABLE: Category "a" represents the ideal situation: the group
reference is both factual and affectively positive.
Example 

The economic health of the Osage took a dramatic turn for the better when 

(A) they succeeded in producing an especially fine variety of cotton 

(B) pooled tribal resources provided the capital to establish a penal factory 

(C) oil was discovered on their reservation 

(D) high-fashion designers displayed an interest in their finely crafted 
jewelry 

(E) concern for the environment led to a general interest in handcrafted 
goods 

Category "b"

UNACCEPTABLE: Category "b" questions, while evoking positive feelings
on the part of referenced groups, are not factual. Such
questions frequently result from the intentional efforts
of the person writing the question to correct a perceived
injustice to a minority group and often represent a
narrow ideological perspective. Additionally, these
questions tend not to have clear-cut correct answers. In most
cases, unacceptable questions can be salvaged by revision.

Example

Which of the following groups has been most successful in obtaining progress
for the Black community?

(A) The Urban League

(B) The Black Panther Party

(C) The Deacons for Defense

(D) The National Association for the Advancement of Colored People






Discussion 

The problem here is with the question itself. Unfortunately, it is
unanswerable as written. What is meant by "most successful"? At what? What
type of progress? The writer clearly had good intentions. The objective was to
include positive material on the minority experience. A question of this type
can be rewritten in several ways. For example: Which of the following groups
emphasizes progress through alliances with the business community? The
answer is the Urban League. Or: Which of the following groups is the oldest?
The answer is the NAACP. Both questions as rewritten have a factual as
opposed to a subjective answer. Notice that as used in option (D) the word
Colored is acceptable and appropriate here.

Category "c" 

UNACCEPTABLE: This third set represents the worst case. These questions
are not factual, and they generate negative feelings on
the part of referenced groups.

Example

All of the following groups have retained some of their original cultural roots
EXCEPT the

(A) Swedish Americans

(B) Italian Americans

(C) Black Americans

(D) American Indians

Discussion

The author of this question intended for the answer to be (C). However,
one school of thought on this issue traces the roots of Black American
culture clearly back to Africa. Black Americans who support this alternative
viewpoint would react negatively to the question as written. Therefore, the
question should be dropped or reworded to read: According to E. Franklin
Frazier (or some other proponent), which of the following groups has not
maintained vestiges of its original cultural heritage?

Category "d" 

UNACCEPTABLE: Questions that fall into this fourth group often lead to a
controversy that is difficult to resolve. Although such
questions are based on fact, they generate negative
feelings on the part of referenced groups. For instance, a
question that emphasizes high birth rates in certain
nations has a factual basis, but it may evoke negative
feelings in Americans who can trace their roots to these
nations, and it reinforces negative stereotypes.*

Example 

All of the following factors account for the use of English as the official
language of the United States EXCEPT:

(A) It is required by a constitutional amendment.

(B) It is the primary language of instruction in public schools.

(C) It is the key to the "Americanization" of non-English-speaking
immigrant groups.

(D) It is usually necessary for career success.

(E) It prevents the emergence of balkanization and separation.

Discussion

Even though choices B-E are true, the question can offend American
citizens who are not native speakers of English, as well as recent immigrant
groups. Choices C and E show an intolerance of other cultures.



* In exceptional instances, material of this nature may be unavoidable. See section on context considerations.






Example

Percent of Female-Headed Families in the United States in 1960 by
Annual Income, Race, and Place of Residence

                        Rural      Urban      Total
                        Percent    Percent    Percent
Black Population
  Under $3,000            18         47         36
  $3,000 and over          5          8          7
  Total                   14         23         21
White Population
  Under $3,000            12         38         22
  $3,000 and over          2          4          3
  Total                    4          7          6



The data in the table above indicate that in the United States in 1960
female-headed families were more common

(A) in rural areas than in urban areas

(B) among Whites than among Blacks at the same income level

(C) among poor Whites than among nonpoor Blacks

(D) among the poor than the nonpoor only in urban areas

(E) among Blacks than Whites in urban areas but not in rural areas

Discussion

This item was written for a general background test. It is unacceptable
in that it

• presents a negative picture of the minority group discussed in the item,

• may arouse negative feelings in test takers,

• is not measuring knowledge of information essential in a discipline,

• is intended for a general population (not students of a particular
curriculum), and

• does not treat the subject with as much sensitivity as it could be treated.

Example 

Population growth rates tend to be highest 

(A) among the poor 

(B) in industrial countries 

(C) in areas with rich food supplies

(D) when birth rates are low and death rates are high

(E) when a nation undergoes a period of severe economic depression

This item was written for a test intended for postgraduates with a special
interest in political affairs, economics, and social structures in the United
States and throughout the world. In fact, candidates taking the test are
expected to demonstrate more than average competence in answering
questions in these areas. Given the special purpose of the test, the population, and
the treatment of the subject in the question, the item is acceptable for the test.
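
The four cognitive/affective combinations described above can be restated as a short sketch. The Python below is an illustration only and is not part of the guidelines: the names `Category`, `classify`, and `is_acceptable`, and the `context_justified` flag, are hypothetical. They restate the chart (factual or erroneous, positive or negative) and the exception noted for category "d" material when the purpose and population of a test call for it.

```python
from enum import Enum


class Category(Enum):
    A = "a"  # factual, evokes positive feelings: acceptable
    B = "b"  # not factual, evokes positive feelings: unacceptable
    C = "c"  # not factual, evokes negative feelings: unacceptable (worst case)
    D = "d"  # factual, evokes negative feelings: unacceptable in general


def classify(is_factual: bool, evokes_positive: bool) -> Category:
    """Map the cognitive and affective dimensions onto categories a-d."""
    if is_factual:
        return Category.A if evokes_positive else Category.D
    return Category.B if evokes_positive else Category.C


def is_acceptable(category: Category, context_justified: bool = False) -> bool:
    """Category "a" is acceptable outright.

    Category "d" (factual but negative) is treated here as acceptable only
    when the reviewer has judged that the test's purpose and population
    justify it. That judgment is an input to this sketch, not something
    the code can decide.
    """
    if category is Category.A:
        return True
    return category is Category.D and context_justified
```

Under these assumptions, the female-headed families item would be `classify(True, False)`, category "d", and unacceptable for a general background test, while the population growth item would pass only with `context_justified=True`.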






Controversial Material 

Highly controversial issues, such as legalized abortion or hypotheses about genetic inferiority, must not
be included in any test question unless such issues are both relevant and essential to the content validity of
the test. If such material is to be used, the question must be constructed to indicate clearly its relationship to
the content validity of the test. Several methods for accomplishing this within the sensitivity guidelines are:

• Identify the source. For example, one could begin a question with "According to (source)" or
"In the opinion of (source)."

• Phrase the question in such a way as to require an in-depth knowledge of the subject matter.

• Balance the first controversial question with another that either refutes the first or presents an
alternative point of view.



Examinee Perspective 

All group reference questions are reviewed from the perspective of test takers who may not have access
to the correct answers. When an examinee must know the correct answer to prevent a question from
reinforcing negative attitudes or stereotypes, the question should be revised or rejected. This is because
examinees who select a wrong option are not routinely informed that their response was incorrect. Thus
their belief in the legitimacy of a negative attitude may be reinforced.

In evaluating perspective, the sensitivity reviewer must recognize that there will be instances,
particularly in content-based tests, where negative statements must appear. For example, negative statements are to
be expected in literature tests, especially material dealing with satire or irony, where the author's statements
may address either individuals or groups. Similarly, in sociology, history, or economics tests, conflicts or
development patterns frequently require knowledge about, or interpretations of, social and/or cultural
documents and concepts that may seem offensive to individuals or groups. In such instances, the test assembler
must be able to demonstrate that a potentially offensive option is a legitimate part of 1) accurately
interpreting a required kind of stimulus material or 2) accurately demonstrating an understanding of the knowledge
base of a particular discipline. The assembler's inability to demonstrate such points will suggest that the
distracter should be revised.



Balance

In general, the sensitivity reviewer should determine whether there is a suitable balance of multicultural
material in final forms of a test or test section. In tests that largely test skills, such as mathematics aptitude
tests and writing ability tests, the numbers of references to males and females in items that refer to people
should be approximately equal. Such tests should also contain references to one or more minority groups, at
least meeting the test's own specifications on multicultural representation. If such a test consists entirely of a
small number of passages, such as some reading comprehension tests, balance requirements should be
applied less stringently; for example, if one out of three passages focuses on either women or a minority
group, the test's balance is acceptable.

Tests that largely assess content should meet their own specifications on sex-referenced and
multicultural material. If the test's specifications do not refer to women and minority groups, ETS's corporate
guideline requiring "the inclusion of material reflecting the cultural background and contributions of major
population subgroups" should be followed. For example, women and minority groups could be mentioned
in items that test skills (for example, as the topic of a graph in a graph reading item in an economics test).

Tests that assess a mixture of content and skills should be evaluated individually, applying the spirit of
this guideline. Such tests may include curriculum-based skills tests, such as the interpretation and analysis of
literature, and occupation-based skills tests, such as police officers' examinations.

In all tests, it is desirable to refer to more than one minority group, rather than focusing all items on a
single minority group.

Because many programs use pretesting to build and augment question pools for the assembly of scored
tests meeting strict content and statistical specifications, the sensitivity reviewer cannot require that a pretest
be balanced in its representation of either women or minorities if the pretest specifications do not
specifically require such material. Notation of pretest specifications should be made by the test assembler on the
test sensitivity review report form.




In judging the balance of a test, the sensitivity reviewer should consider not only the numerical balance
of sex and minority representation but also a more holistic appraisal of the overall impression that is made
by the test's references to women and minority groups. The ways in which men, women, and minority group
members are portrayed and the strength of the various references are among the factors to consider in
making such a holistic appraisal.

For computerized, self-selecting, or branching tests, the entire pool of items should be reviewed before
the system is used. At that time, the pool should be evaluated to see if it contains an acceptable balance.

References to persons with disabilities are not part of the balance requirement for any test.
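
The numerical side of the balance requirement can be sketched as a simple tally. The Python below is a hypothetical illustration, not an ETS procedure: the `ItemReferences` record, the `balance_report` function, and especially the `tolerance` value are assumptions, since the guideline says only that male and female references should be "approximately equal." The holistic appraisal described above remains a matter of reviewer judgment and is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ItemReferences:
    """Hypothetical tally of group references a reviewer records for one item."""
    male: int = 0
    female: int = 0
    minority_groups: set = field(default_factory=set)


def balance_report(items, tolerance=0.10):
    """Summarize the numerical balance of a skills test.

    `tolerance` is an assumed threshold on the gap between male and female
    references, expressed as a share of all sex-referenced mentions.
    """
    male = sum(item.male for item in items)
    female = sum(item.female for item in items)
    groups = set().union(*(item.minority_groups for item in items))
    total = male + female
    gap = abs(male - female) / total if total else 0.0
    return {
        "male": male,
        "female": female,
        "sex_balanced": gap <= tolerance,
        "minority_groups": sorted(groups),
        # At least one minority group reference is expected; more than one
        # is described as desirable.
        "refers_to_minority_group": len(groups) >= 1,
        "refers_to_multiple_groups": len(groups) >= 2,
    }
```

A reviewer might run such a tally on the final form of a test and then turn to how each group is portrayed, which no count can capture.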

Stereotyping 

Sensitivity reviews must ensure that no test implies that a group is culturally or biologically
inferior or superior to any other group. Thus, the review should determine whether the material contains language
or symbols that reinforce offensive stereotypes. Such stereotypes generally suggest the physical (e.g., height,
weight, attractiveness, strength) or psychological (e.g., intelligence, ethics, emotions, behavioral patterns)
inferiority of a particular group in some characteristic considered desirable by the majority culture. For
example, material that refers to the alleged predominance of Italian Americans in organized crime implies
that Italian Americans are dishonest. Occasionally, an offensive stereotype implies a superiority of one
group over another. For example, many would view as offensive a question that implies that males are
better drivers than females. Material judged to contain language or symbols that reinforce offensive
stereotypes is not acceptable. (See Appendix A for examples of offensive stereotypes.)

It is also important to avoid stereotyping women or a minority group by portraying them in only one
role, especially if it is a stereotypical role. Instead, they should be portrayed engaging in many different
activities. For example, if a woman is engaged in a traditional activity like child-rearing in one item, it is
desirable to have one or more items in the test in which women are engaged in less traditional activities,
such as working as a lawyer or business executive.

In evaluating stereotypes, the sensitivity reviewer must recognize that there will be instances where
stereotypes are likely to appear as part of the content-related material of a test. For example, there may be
instances where a social worker must know common stereotypes in order to deal with social problems or
where a historian must be aware of stereotypes in order to accurately interpret historical documents. Here,
as in evaluating perspective, the assembler must be able to demonstrate that the presence of a stereotype and
the test taker's ability to recognize and interpret it are required by the discipline.

Caution Words and Phrases 

Sensitivity reviews should reflect the fact that even words with legitimate uses can sometimes appear in
contexts that make them unacceptable. Through experience, sensitivity reviewers have learned that certain
key words and phrases often accompany sensitive material. Thus, although the use of these words and
phrases is proper and legitimate, the appearance of these words indicates that the material requires special
attention because of an increased potential for offensiveness. Examples of such words are lower-class,
discrimination, and race. (See Appendix B for a more comprehensive list of examples.)
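
As a rough illustration of how caution words might be used in practice, the sketch below flags text for closer review. It is an assumption-laden example, not a tool described in the guidelines: the function name `flag_caution_terms` is hypothetical, and `CAUTION_TERMS` holds only the three examples named above rather than the full Appendix B list. A flag means "look more closely," not "reject."

```python
import re

# Only the three examples named in the text; Appendix B lists many more.
CAUTION_TERMS = ("lower class", "discrimination", "race")


def flag_caution_terms(text, terms=CAUTION_TERMS):
    """Return the caution terms that occur in `text` as whole words.

    Matching ignores case and lets the words of a multiword term be joined
    by spaces or hyphens, so "lower-class" matches "lower class".
    """
    found = []
    for term in terms:
        words = [re.escape(word) for word in term.split()]
        pattern = r"\b" + r"[\s-]+".join(words) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found
```

Because the match is on whole words, a sentence about a boat race is flagged (and then cleared by the reviewer), while a word such as "embrace" is not.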

Special Review Criteria for Women's Concerns 

Sensitivity reviewers should seek to identify and eliminate all language that discriminates on the basis
of sex. (See Appendix C for detailed requirements.)



Special Review Criteria for References to People with Disabilities 

Sensitivity reviewers should seek to identify and eliminate all language that discriminates on the basis
of physical or mental disabilities. (See Appendix D for detailed requirements.)






Underlying Assumptions 

Sensitivity reviewers should attempt to eliminate ethnocentric or gender-based beliefs and prejudices.
An underlying assumption is a subtle secondary premise that reflects an individual's ethnocentric beliefs.
Unacceptable underlying assumptions include the following:

• That a group is deserving of a particular fate.

• That a group is by nature dependent on help from another group.

• That a group lacks or has an excess of any given quality fairly common to humans.

• That what may be a norm only in Western culture is "truth" or that European civilization is "better"
than (as opposed to "different" from) other civilizations.

• That a causal link exists between any group and poverty, crime, intelligence, and the like.

Context Considerations 

Sometimes the use of sensitive material is unavoidable. There are four areas in which this occurs with
some frequency:

1. Historical Domain: In order to measure an individual's knowledge of history, it may sometimes be
desirable to quote from material written during a period when social values differed markedly from
today's. For example, a passage describing the conditions of Southern Black people during the
reconstruction period may include the term "colored people" or "Negro." While it is desirable to
avoid the use of such material where possible, the sensitivity issues must be judged in the overall
context in which they are presented.

2. Literary Domain: Material that is designed to measure an individual's knowledge of literature or
that quotes from works of literature often contains similar problems. For example, many passages
from material written before the 1970s may include constant use of the so-called "generic he," a style
that was considered editorially correct until recently. Similarly, passages that deal with cultures other
than the majority culture may vary in purpose or in methods of discussing ethnic ideals or attitudes.
In all such instances, the stimulus material and items must be reviewed carefully.

3. Legal Domain: Material drawn from legal sources may sometimes deal with sensitive areas. For
example, real estate tests may contain references to federal, state, or local laws governing
discrimination in the mortgage rights of EEO classes.

4. Health and Social Sciences Domains: Certain examinations in these domains (including health
professions, social work, and civil service) require knowledge of information that may be considered
sensitive in other contexts. For example, in nursing tests it may be necessary to test one's knowledge
of the predominance of sickle-cell anemia among Black people or Tay-Sachs disease among Jewish
families. Social work and the civil service require knowledge on how to approach problems and/or
counsel people in a wide variety of social and cultural contexts.

Inclusion of potentially sensitive material depends on the content of the entire test or publication.
Given an appropriate context, use of certain material may be justifiable. It is important to recognize that
many subject-matter tests must include information and concepts that have a great potential for raising
sensitivity issues. The test assembler and the sensitivity reviewer are responsible for working together to
develop test material that covers necessary subject matter (such as slavery, the Japanese-American
internment during World War II, ethnic components of social problems) in a theoretically balanced, sensitive, and
objective manner.

Elitism, Ethnocentricity, and Related Problems

To eliminate concepts, words, phrases, or examples that may upset or otherwise disadvantage a test
taker, every effort is made not to include expressions that might be more familiar to members of a particular
social class or ethnic group than the general population, such as "soul food" and "trust fund," unless they
are defined or knowledge of them is relevant to the purpose of the test. Words and sentence constructions
that could have different meanings to different ethnic or geographic groups must be avoided. Care must also
be taken to assess the appropriateness of dialect, slang, and non-English words and phrases, such as
"bairn," "stickball," and "maven," which tend to be more familiar to certain ethnic, geographic, or other
subgroups of English speakers.






APPENDIX A 



GUIDELINES FOR RECOGNITION

OF UNACCEPTABLE STEREOTYPES

This appendix lists some of the stereotypes that have been identified by members of major population
groups. When reviewing these examples, it is important to understand that they are not intended to serve as
an exhaustive list of all possible stereotypes but rather to illustrate some of the more commonly encountered
ones.

Although some words in this appendix may seem dated or may appear infrequently in contemporary
texts, they are found in many sources (such as literary passages, historical documents and cartoons, and
popular publications) that provide stimulus material for test questions. Such words are listed here in order
to remind the reviewer that they, and words like them, must always be carefully evaluated, regardless of the
context in which they appear.

1. No population group should be depicted through language or symbols as superior or inferior with
regard to

contribution to society        intelligence
education                      leadership ability
emotional stability            morality
honesty                        physical appearance
industriousness                physical capabilities

2. No population group should be depicted through language or symbols as fixated on instant
gratification (unable to plan for the future).

3. No population group should be depicted as unable to mix socially with other groups.

4. No population group should be depicted as superior or inferior in its social institutions, social
organizations, or social structures.



Examples of Unacceptable Stereotypes 

1. That Asian/Pacific Americans:

• are only suited for certain vocations and professions (e.g., food service work, laundry work,
mathematics);

• speak "pidgin" English;

• are short, skinny, and wear glasses;

• subsist on chop suey, fried rice, herbal tea;

• live or prefer to live in ethnic neighborhoods (e.g., Chinatown, Little Seoul);

• are predominantly refugees;

• marry in accordance with family wishes or as a result of a prearranged agreement between families;

• practice polygamy;

• favor sons over daughters and first sons over all other siblings;

• have little regard for human life;

• use narcotics, particularly opium and its derivatives;

• require women to be passive and submissive;

• all share the same basic culture (as opposed to recognizing substantial cultural variations that exist in
the heritages of Asian/Pacific Americans).






2. That Black Americans:

• are only suited for certain vocations and professions (e.g., sports, music, teaching);

• are less prepared or less adequate as professionals;

• comprise the majority of individuals receiving welfare support;

• (males) often desert their families;

• are not punctual;

• frequently engage in civil disorders and looting;

• live exclusively in depressed urban areas;

• are licentious (overpopulate, routinely engage in sexual relations at a young age, etc.);

• are unaware of their African heritage;

• gamble excessively;

• drink excessively;

• have an inherently superior sense of rhythm;

• speak "Black" language;

• excel in physical as opposed to intellectual endeavors.

3. That Hispanic Americans:

• are only suited for certain vocations and professions (e.g., service work, agricultural work);

• are licentious (overpopulate, routinely engage in sexual relations at a young age, etc.);

• are violent or bloodthirsty (bullfighting, revolutionary, etc.);

• receive a disproportionately high percentage of welfare support;

• speak dialects of Spanish unintelligible to other Hispanic groups;

• are not punctual and frequently procrastinate (mañana attitudes);

• (men) physically dominate women (macho attitudes);

• are all alike as opposed to recognizing cultural differences (e.g., Puerto Rican, Cuban, Chicano, etc.).

4. That Native Americans/American Indians:

• are unable to handle alcoholic beverages;

• are "closer to nature" than other Americans;

• live in teepees and/or slums;

• lack the ability to deal with modern technology;

• lack the ability to deal with intellectual and academic endeavors;

• are unusually hostile, violent, or apathetic;

• are all alike (as opposed to recognizing substantial variations among and within the Indian Nations).

5. That women:

• are only suited for certain vocations and professions (e.g., elementary school teaching, nursing,
secretarial, librarian);

• are less prepared or less adequate as professionals than men;

• are weak, fragile, or passive;

• are overly emotional or hysterical (panic in crises);

• are disorganized, illogical, or scatterbrained;

• frequently engage in gossip;

• compete with each other;

• lack basic mechanical ability (e.g., can't drive a car or fix a leaky faucet);

• lack ability to excel at any activity (music, science, etc.);

• are overly concerned with their physical appearance;

• are pushy;

• lack qualities of leadership (i.e., self-confidence, ambition, and assertiveness);

• lack basic ability in mathematics.

6. That persons with disabilities:

• are helpless or less able than others who take care of themselves;

• are to be pitied or patronized;

• are nonproductive members of society;

• are in need of government assistance.






Appendix B 



CAUTION WORDS AND PHRASES 

Experience has shown that the following words and phrases frequently accompany sensitive material.
While the vast majority of these words and phrases are themselves legitimate and are often used
appropriately, they tend to indicate an increased potential for the presence of offensive material.

Although some words in this appendix may seem dated or may appear infrequently in contemporary
texts, they are found in many sources (such as literary passages, historical documents and cartoons, and
popular publications) that provide stimulus material for test questions. Such words are listed here to remind
the reviewer that they, and words like them, must always be carefully evaluated, regardless of the context in
which they appear.

1. Caution words and phrases with regard to all population groups



affirmative action
backlash
backward, backwardness
barbarian, barbaric
birthrate
civil disorder
civilized
class, lower class, middle class, upper class
colonialism, colonized
crime, criminal, crime rates
culture, cultural bias,
* culturally deprived
* culturally disadvantaged
deficient
deprived
developing nation
deviance, deviant behavior
dialect
disadvantaged
discrimination
emotional, emotionalism
environment
equality
freedom
gangs
genetic, genetic inferiority, genetic superiority
ghetto
ignorant
illegitimate
illiteracy, illiterate, illiterates
inequality
inferior
inner city
instant gratification
intelligence, intelligence test
juvenile delinquency
masses, the masses
melting pot
minority
nonwhite
single-parent family
physical type, physical capabilities, physical characteristics
preferential treatment
primitive
promiscuous
race, racism
riot
ritual
social class, social development
socioeconomic
Third World
uncivilized
underprivileged
* underdeveloped nations
uneducated
urban
violent, violence
welfare

2. Caution words and phrases with regard to Asian/Pacific Island Americans

Asian American(s)
* Chink (demeaning abbreviation of Chinese)
Chinaman, Chinamen
Chinawoman, Chinawomen
* Far East
* Jap, Japs
Japanese
* Orient
** Oriental(s)
Pacific Islander(s)

Note: The distinct terms Asian American, Pacific American, and Asian/Pacific Island American should
be used according to accuracy and appropriateness.

* These are generally unacceptable terms.
** Whenever possible avoid using these terms as nouns. It is preferable to use them as adjectives, i.e., Asian Americans
or Black people.






3. Caution words and phrases with regard to Black Americans

Africa, African
African(s), Afro-American(s)
Black
** Blacks
Black Americans
busing
* colored, colored people
desertion, desertion rates
integrate, integration
jungle
native
* Negro, Negroes
people of color
primitive
segregate, segregation
slaves, slavery
South Africa, South African(s)
tribe, tribal



4 Caution words and phrases with regard to Hispanic Amencans: 



bamo 
bilingual 
Chicano(s) 
Cuba, Cuban 

Cuban(s). Cuban Amencan(s) 

extended family 

Hispanic 
••Hispanics 

Hispanic Amencan(s) 
•*Latin(s), Latin Amencan(s) 

Latino 

n^^cho, inachismo 

Caution words and phrases with regard to Nativr Amencans/Amencan Indians 
Aleut{s) (Use this form instead of Eskimo ) rative 
Amencan Indian(s) Native Amencan(s) 

• Eskimo(s) • rtdman, redmen 

Indian 

lnuit{s), Innuit(s) (Use this fonn instead 
of Eskimo ) 



• Mex 

Mexico, Mexican 
Mexican Amencan(s) 
nation, nations 
New Rican 

Puerto Rico. Puerto Rican 
Puerto Rican(s), Puerto Rican 

Amencan(s) 
Spanish, Spanish Amencan(s) 

• Tex-Mex 



reservations 

treaty, treaty pnvilege 

tnbe, tnbal 



Note: The terms Native American and American Indian are both acceptable and may be used
independently, as appropriate.

* These are generally unacceptable terms.
** Whenever possible avoid using these terms as nouns. It is preferable to use them as adjectives, i.e., Asian Americans
or Black people.






6. Caution words and phrases with regard to women and men (see also
Appendix C)

* better half
boy(s), boyish
* coed (as a noun)
* distaff side
domineering
females, feminine, feminist
frivolous
gender
he, his, him
* housewife
homemaker
hysterical
lady, ladyish
libber, women's libber
maid, maiden
male(s), masculine
man, manly, manhood, men
matriarch
Miss, Mrs., Ms.
mother, mother-in-law, grandmother
nosey
old maidish
patriarch
picky
pushy
sex, sexes, sexy
she, her, hers
stubborn
woman, womanly, womanhood
women



7. Caution words and phrases with regard to persons with disabilities (see also Appendix D)

* afflicted, afflicted with
* crippled
* deaf and dumb
deformed
drain/burden
handicapped
normal
patient
retarded
* wheelchair bound, confined/restricted to wheelchairs

* These are generally unacceptable terms.
** Whenever possible avoid using these terms as nouns. It is preferable to use them as adjectives, e.g., Asian Americans or Black people.






Appendix C 

SPECIAL REVIEW CRITERIA FOR WOMEN'S CONCERNS*

1. Women must not be described by physical attributes when men are being described by mental attributes
or professional position. Irrelevant references to a man's or a woman's appearance, charm, or intuition
are not acceptable.

2. In descriptions of women, a "patronizing" or "girl-watching" tone is not acceptable, nor are sexual
innuendoes, jokes, or puns. Examples of unacceptable practices are focusing on physical appearance (a
buxom blonde), using special female-gender word forms (poetess, aviatrix, usherette), and treating
women's issues as humorous or unimportant. The following list identifies a number of generally
unacceptable words and phrases and presents one or more acceptable substitutes for each case:

Unacceptable: the fair sex; the weaker sex; the distaff side
Acceptable: women

Unacceptable: girl, as in: I'll have my girl check that
Acceptable: I'll have my secretary (or my assistant) check that. (Or use the person's name.)

Unacceptable: lady used as a modifier, as in lady lawyer
Acceptable: lawyer (A woman may be identified simply through the use of pronouns, as in The lawyer made
her summation to the jury. When gender modifiers are required, use woman or female, as in a course
on women writers, or the airline's first female pilot.)

Unacceptable: the little woman; the better half; the ball and chain; and other such colloquialisms
Acceptable: wife, spouse

Unacceptable: female-gender word forms, such as authoress or poetess
Acceptable: author, poet (Some words like heroine or actress can be used if they seem appropriate in
the given context.)

Unacceptable: female-gender or diminutive word forms, such as suffragette, usherette, aviatrix
Acceptable: suffragist, usher, aviator (or pilot)

Unacceptable: libber (a put-down)
Acceptable: feminist

Unacceptable: sweet young thing
Acceptable: young woman; girl

Unacceptable: coed (as a noun)
Acceptable: student

Unacceptable: housewife
Acceptable: homemaker for a person who works at home, or rephrase with a more precise or more
inclusive term

Unacceptable: career girl or career woman
Acceptable: Identify the woman's profession: attorney Ellen Smith; Maria Sanchez, a journalist or
editor or business executive or doctor or lawyer or agent

Unacceptable: cleaning woman, cleaning lady, or maid
Acceptable: housekeeper, house or office cleaner

3. In descriptions of men, especially men in the home, references to general ineptness are not acceptable.
Men should not be characterized as dependent on women for meals, clumsy in household maintenance,
or foolish in self-care.

4. Women must be treated as part of the rule, not as the exception. Generic terms, such as doctor and
nurse, are assumed to include both men and women, and modified titles such as woman doctor or male
nurse are not acceptable. Stereotyping work activities as "woman's work" or a "man-sized" job is not
acceptable.

* Source: McGraw-Hill's Guidelines for Equal Treatment of the Sexes. Used with the permission of
McGraw-Hill Book Company. (See also Appendix B, Section 6.)






5. Women should be spoken of as participants in any action, not as passive bystanders. Terms such as
pioneer, farmer, and settler must not be used as though they apply only to adult males. Examples
follow.

Unacceptable: Pioneers moved West, taking their wives and children with them.
Acceptable: Pioneer families moved West, or Pioneer men and women (or pioneer couples) moved
West, taking their children with them.

6. Women must not be portrayed as needing male permission in order to act or to exercise their rights.
Example:

Unacceptable: Jim Weiss allows his wife to work part-time.
Acceptable: Judy Weiss works part-time.

7. The word man has long served to denote both a person of male gender and humanity at large. To many
people today, however, the word man is so closely associated with the first meaning (a male human
being) that it is no longer considered broad enough to be applied to a person of either gender.
Therefore, alternative expressions must be used in place of man (or derivative constructions used generically
to signify humanity at large). The following list identifies acceptable alternatives for man-words.

Man-word                        Preferred Alternative

mankind                         humanity, human beings, human race, people, humankind
man's achievements              human achievements
If a man drove 50 miles at      If a person (or driver) drove 50 miles at
60 mph...                       60 mph...
the best man for the job        the best person (or candidate) for the job
manmade                         artificial, synthetic, constructed, fabricated
manpower                        human power, human energy, workers, work force,
                                human resources, personnel
grow to manhood                 grow to adulthood, grow to manhood or womanhood

8. Use of the so-called "generic he" is unacceptable. Here, as elsewhere, historical context and/or direct
quotations must be considered when evaluating material.

Passages chosen for reading comprehension items in admissions tests may not use the "generic he."
Tests or test sections composed of discrete items have the following possibilities:

A) If the item has several references, balance the use of "he" and "she" within the item.

B) Change "he" to a specific name: Sam, Jim, etc. Change other items to Jane, Cheryl, etc. Then
balance the items throughout the test or test section.

Finally, note that a stem like "A man drove 50 miles..." is not a "generic he" item. Items like
this need only be balanced with items like "A woman invested..." Examples of other alternatives
follow:

Unacceptable: The average American drinks his coffee black.
Acceptable: The average American drinks black coffee.

Replace the masculine pronoun with one, you, he or she, her or they, or people as appropriate.
Alternate male and female expressions and examples to establish a balance within an item.
Example:

Unacceptable: I've often heard supervisors say, "He's not the right man for the job," or, "He
lacks the qualifications for success."
Acceptable: I've often heard supervisors say, "She's not the right person for the job," or, "He
lacks the qualifications for success."






9. Occupational or activity terms ending in man are not acceptable when they can include members of
either sex. Exceptions can be made for references to a particular person. Examples:

Unacceptable        Acceptable

congressman         member of Congress, representative (but Congressman
                    Koch and Congresswoman Holtzman are acceptable)
chairman            chair, chairperson, the person presiding at (or chairing) a
                    meeting, the presiding officer, head, leader, coordinator,
                    moderator (Also acceptable are Chairwoman Shirley
                    Chisholm and Chairman Mao.)

(Note that "Chairman John Doe and Chairperson Jane Doe" is not an acceptable combination, since
man and person are not parallel.)

businessman         business executive
fireman             firefighter
mailman             mail carrier, letter carrier
salesman            sales representative, salesperson, sales clerk
insurance man       insurance agent
cameraman           camera operator
foreman             supervisor

10. Test items that assume all test takers to be male are unacceptable.

Unacceptable                        Acceptable

you and your wife                   you and your spouse
when you shave in the morning       when you brush your teeth (or wash up) in the morning

11. Parallel language must be used for women and men.

Unacceptable            Acceptable

the men and ladies      the men and the women, the ladies and the gentlemen,
                        the girls and the boys
man and wife            husband and wife

(Note that lady and gentleman, wife and husband, and mother and father are role words. Ladies should
be used for women only when men are being referred to as gentlemen. Similarly, women should be
called wives and mothers only when men are referred to as husbands and fathers. Like a male shopper, a
woman in a grocery store should be referred to as a customer and not as a housewife.)

12. A woman must be referred to in a manner that is parallel with references to a man. Both should be
called by their full names, by first or last name only, or by title. Examples:

Unacceptable                    Acceptable

Bobby Riggs and Billie Jean     Bobby Riggs and Billie Jean King
Billie Jean and Riggs           Billie Jean and Bobby
Mrs. King and Riggs             King and Riggs; Ms. King (because she prefers Ms.) and
                                Mr. Riggs
Mrs. Meir and Moshe Dayan       Golda Meir and Moshe Dayan

13. Women should be identified by their own names (e.g., Indira Gandhi). They should not be referred to in
terms of their roles as wife, mother, sister, or daughter, nor should they be identified in terms of their
marital relationships unless paired up with similar references to men or such references are basic and
necessary to effective measurement.






14. Pronouns must not be linked with certain work or occupations on the assumption that the worker is
always (or usually) female or male. Examples:

Unacceptable                            Acceptable

the consumer or shopper...she           consumers or shoppers...they
the secretary...she                     secretaries...they
the breadwinner...his earnings          the breadwinner...his or her earnings, or
                                        breadwinners...their earnings

15. Males should not always be first in order of mention.



Appendix D 

SPECIAL REVIEW CRITERIA FOR REFERENCES
TO PEOPLE WITH DISABILITIES*

Sensitivity reviewers should be particularly aware of the ways in which people with disabilities are
portrayed. People and their worth as individuals should be emphasized, not the disabling conditions they
may have. Referring to people as their conditions is demeaning and inaccurate.

All terms that have negative connotations or that reinforce negative judgments (e.g., crippled man or
crazy woman) should be replaced with terms that are as objective as possible. No one who has a disability
should be pictured as helpless or pitiful. People who have disabilities may be parents, teachers, business
owners, leaders in their communities: in short, responsible, productive members of society who are neither
to be pitied nor patronized.

For general publications, as well as for tests in which sentences and reading passages contain general
information but are not testing knowledge of that information (e.g., SAT and GRE sentence-completion
items), it is important to watch for labels attached to people. Identifying a computer programmer as
paraplegic or an artist as learning disabled, for instance, is probably gratuitous and irrelevant to the
programmer's or the artist's ability to function. On the other hand, it may be acceptable in a test to have a
reading passage that describes how one person successfully manages a particular disability.

Although there is considerable agreement among organizations that represent or are concerned about
particular groups regarding language usage and appropriate terminology, in both instances differences of
opinion still exist. Sometimes usage that ETS would prefer to avoid may be part of the historic title of an
organization, e.g., the American Council of the Blind. In this instance, the word blind is used as a noun
instead of an adjective, which is the generally preferred use. Sometimes it is the term itself that is no longer
appropriate (e.g., mental deficiency, afflicted). If an association, journal, or publication has such a term in
its name (e.g., the American Association on Mental Deficiency) then one must use the correct name of the
organization. However, the use of these terms should be avoided where it is appropriate to do so.

The following unacceptable terms and the preferred alternatives are meant to be guidelines, not
absolute, inflexible standards. Tests or other publications that deal specifically with teaching, diagnosis, or
treatment may require using terms on the unacceptable list in order to convey technical information. If so, the
test assembler should check the "special considerations" box on the front of the test sensitivity review report
form and note that the test contains specialized material and explain for whom the test is intended. A
publications editor should note the specialized material on the publications sensitivity review form.

Generally unacceptable: the use of a handicapping condition as a noun, e.g., the deaf, the blind, the handicapped
Preferred alternatives: as adjectives: a deaf student, a blind child, handicapped people

Generally unacceptable: afflicted/afflicted with/afflicted by/affliction
Preferred alternatives: person who has ___; people who are affected by ___

Generally unacceptable: confined to/restricted to a wheelchair/wheelchair bound
Preferred alternatives: person who uses a wheelchair; person who gets around by wheelchair; wheelchair user

Generally unacceptable: cripple/crippled
Preferred alternatives: person who has a physical disability; physically disabled people

Generally unacceptable: deaf and dumb
Preferred alternatives: people who cannot hear or speak

Generally unacceptable: diagnose/diagnosed
Preferred alternatives: correct only to describe a condition, not a person: the condition was diagnosed as ___

Generally unacceptable: disease
Preferred alternatives: use the word condition or specify the name of a condition, such as a person who has multiple sclerosis

Generally unacceptable: drain/burden
Preferred alternatives: person who has a condition that requires increased (or additional) responsibility (or care or intensive care)

Generally unacceptable: inflicted with/inflicted
Preferred alternatives: caused by ___; disabled by ___

Generally unacceptable: normal
Preferred alternatives: people without disabilities; nondisabled people; nonhandicapped person

Generally unacceptable: patient (noun)
Preferred alternatives: use only to refer to a person who is being treated by a physician at home or who is in a hospital

Generally unacceptable: retarded
Preferred alternatives: See note 3 below

Generally unacceptable: victim/victim of
Preferred alternatives: person who has ___; people who experience ___

Generally unacceptable: blind as a bat/crazy/crip/deformed/dumb/freak/gimp/insane/pitiful/poor/unfortunate
Preferred alternatives: these terms and others like them should NEVER be used

* In part derived from literature issued by the Gilbert M. and Martha H. Hitchcock Center for Graduate Study and
Professional Development, the University of Nebraska-Lincoln School of Journalism, the Ontario March of Dimes
brochure, the National Easter Seals Guidelines, the Cerebral Palsy Foundation, and a guidebook published by the
International Association of Business Communications.



Additional Notes

1. There are guide dogs and seeing-eye dogs for persons who are blind and hearing-ear dogs for persons
who are deaf.

2. When people who are deaf communicate by the use of their hands, they may be described as signing.
People are described as interpreting when they render what someone is saying into sign language for a
deaf person.

3. In addition to the general guidelines discussed, the NTE Education of Mentally Retarded Students test
committee has made several more decisions about appropriate and current terminology in the field of
mental retardation. Among them:

• Use mentally retarded rather than retarded alone, e.g., mentally retarded students rather than retarded
students.

• To specify degrees of mental retardation, use the following:

mildly retarded (educable) student
moderately retarded (trainable) student
severely and profoundly retarded student

• Use the term mildly mentally retarded in place of cultural-familial retarded.

• Use Down syndrome rather than Down's syndrome (in keeping with new terminology in the 1983
AAMD Classification in Mental Retardation).

• Refer to occupational and vocational education programs as career education programs.

• Where appropriate, use students rather than children in order to accurately reflect the age range of those
in special education programs.






APPENDIX E 



SAMPLE FORMS

TEST SENSITIVITY REVIEW REPORT FORM

Page 1 (Test Assembler should fill in all information above the double line.)

Form designation ____
Project/Job ____
Test Assembler ____
□ Special considerations (please specify consideration) ____

TEST SPECIFICATIONS

□ Final form: test specifications require multicultural material (including minority groups and women)
□ Final form: test specifications do not require multicultural material
□ This is a pretest [illegible]
□ This pretest is required to have multicultural material
□ This pretest is not required to have multicultural material

Page 2

Test Sensitivity Reviewer ____

OUTCOME OF REVIEW

□ Test is approved; no changes required
□ Changes are recommended (see comments)
□ Test is acceptable as revised
□ Disagreement resolved on (date) ____
□ Test referred to arbitration on ____

Date received ____

□ This is a preliminary review (before editing)
□ This is a mandatory review (after editing)

Required signatures: Sensitivity Reviewer ____ Test Assembler ____

Total number of items in this test ____

In the boxes below, list the item number in each category. An item referring to more than one subgroup should be listed
under each subgroup mentioned in that item; that is, an item mentioning a Black woman should be listed under both female
and Black Americans.

[Category boxes, partly legible: Asian [illegible], Hispanic Americans, Native Americans, Others (specify), Total]

Test Review

□ Test meets its specifications for inclusion of multicultural material.
□ Test does not meet its specifications for inclusion of multicultural material.

Comments: Test Sensitivity Reviewer
Please indicate item number [illegible] reason for recommending revision.

Item Review

□ [illegible]
□ [illegible]
□ No comments on balance
□ See comments on balance below

Response from Test Assembler
Please indicate whether revision has been made. If revision has not been made, please explain why.

Pages 3-4

Test Sensitivity Reviewer ____
[illegible] attached sheets; please indicate number of pages attached ____
Test Assembler ____






How to Fill Out the Blue S.R. Form

Page 1

It is the test assembler's responsibility to see that the entire portion above the double line is filled in
before the test goes to a sensitivity reviewer.

Page 2

1. List item numbers on the appropriate lines.

[illegible] only the item number. Do [illegible] or an item based on a
chart of ten United States vice-presidents [illegible] not correct to enter the item as [illegible]

[illegible] appropriate boxes. If, on page 1, the test assembler has indicated that
the test does not require multicultural material, leave the left-hand boxes blank.

Pages 3-4

1. Write your comments on individual items (or sets of items for a passage).

Sign Off

1. If the test requires no changes, check the first box on the left-hand side of the first page and sign off on
the test. Also, sign off on the appropriate line on the Test Assembler's Control Sheet.

[illegible] left-hand side of the first page,
sign your name on the right-hand side, and sign the Test Assembler's Control Sheet.

3. If you and the test assembler cannot agree on changes and/or deletions, check with your divisional
sensitivity review coordinator and make arrangements to consult with the TD [illegible] NOT sign
the blue form or the Test Assembler's Control Sheet.




PUBLICATIONS SENSITIVITY REVIEW FORM 

Review Date ____

Project Director ____

Sensitivity Reviewer ____

Results of Review: □ OK  □ Revision recommended

Reviewer's Comments

Project Director's Response

Approved (Sensitivity Reviewer) ____
Date ____






[illegible] ATTACHED TO THE SENSITIVITY REVIEW FORM AND
REMAIN A PART OF THE PERMANENT RECORD.

Sensitivity Review Arbitration Control Sheet

I have reviewed the

□ Test, form ____

□ Publication, title: ____

and discussed my comments with the assembler/project director. We have been unable to reach a
satisfactory resolution of my concerns as explained on the attached sheet(s).

Signed ____
Reviewer

[illegible] unable to
reach a satisfactory resolution of my concerns as explained on the attached sheet(s).

Signed ____
Assembler/Project Director

I have been notified of the need for arbitration on the test form/publication designated above.

Signed ____
Sensitivity Review Coordinator

I have discussed the disagreement as described on the attached sheets with the assembler/project
director and the reviewer. I am aware that the matter is being sent to arbitration.

Signed ____
TD or Division Director

Having reviewed the written attachments to this control sheet, the arbitration panel has decided
as follows: (Continue on a separate sheet if needed.)

Signed 1. ____
       2. ____
       3. ____






APPENDIX F 



SENSITIVITY-RELATED SECTIONS OF ETS OFFICIAL DOCUMENTS 

ETS Standards for Quality and Fairness 
Test Development Guideline 

Prepare, with appropriate advice and review, specifications for each test that cover the following:

• Sensitivity: requirements for materials reflecting the cultural background and contributions of
major population subgroups

Test Development Guideline 

Review individual items, the test as a whole, and descriptive materials to assure that

• language, symbols, words, phrases, and content that are generally regarded as sexist, racist, or
otherwise potentially offensive, inappropriate, or negative toward major subgroups are eliminated

Accountability Guideline 

Review publications and other materials to eliminate language or material generally regarded as sexist,
racist, or otherwise offensive or inappropriate.






Memorandum for: COLLEGE BOARD TEST DEVELOPMENT
                COPA TEST DEVELOPMENT
                SHEP TEST DEVELOPMENT

                Ms. Dwyer
                Mr. Kimmel
                Mr. Klein

Info, cc: Test and Publications Editors

Subject: Test Sensitivity Review:
         Guidelines for Tallying
         Balance

Date: February 26, 1987

From: R. W. Adams
      J. Hsia
      G. D. Saretzky



[illegible] be placed after Appendix F in the [illegible]
sensitivity review notebook. Please [illegible]

1. [illegible] guidelines are a part [illegible] Even though the [illegible] represents [illegible]
process [illegible]

2. [illegible] to be reviewed to [illegible]
required to be balanced in their representation
of women or members of minority groups unless
the pretest specifications specifically call
for such material. (See page 14 of the
Guidelines and Procedures [illegible])

3. Test assemblers are urged to inform committees,
consultants, collaborators, and survey
recipients of the general ETS goals for
[illegible] representation and to make
certain that [illegible] writers understand the
[illegible] possible, the basic ETS
standard for fairness and quality.

/ddmh 

Attachments 






Balance 



Test Type: CONTENT 

Definition: The CONTENT test is designed primarily to measure knowledge that
is specific to a subject; tests measuring knowledge of economics,
United States history, literary history, physics, and the like
fall into this category. It is expected that such tests will
have detailed specifications based on committee recommendations
and reflecting, to the extent possible, current curricula.

In general, the items in a CONTENT test ask questions that
require the test taker to make use of course-specific information
to answer the question posed or to reason through to the correct
answer. Examples include, but are not limited to:

In which of the following circumstances was the National
Recovery Act proposed?

Which of the following is part of the Bill of Rights?

Who wrote The Autobiography of Alice B. Toklas?

Which of the following is characteristic of the Gothic novel?

Some items in a CONTENT test may be testing not particular
knowledge, but a particular skill such as the ability to
interpret data in a form typically used in a discipline. For
example, an economics test may require the interpretation of
economic data in chart form, or a history test may require the
interpretation of a historical or demographic map. Nevertheless,
the presence of a small number of skills items in a CONTENT test
does not change the classification of the test for the purposes
of sensitivity review.

Balance: CONTENT tests meet their own specifications and need not be
balanced. They should, however, include in their specifications
an indication of the way in which the test can reflect, wherever
possible, the contributions of women and minority-group members
to the discipline. An economics test, for example, may specify
that two items will deal with the impact of women in the labor
force and two items with minority-owned businesses. If these
items are present in the test, the requirements of the CONTENT
specifications have been met and the test is acceptable. It is
expected that in some subject-matter areas, for example literary
history, there will be considerably more source material
available concerning women and minority-group members than there
will be in others, for example Latin or Classical History. In
the event that the CONTENT test contains items where he/she can
be used interchangeably, the test assembler should strive for a
balance.






[illegible] are urged to do the following:

[illegible] survey recipients, or other consultants about the
ETS [illegible] and the implications of
these standards for test content.

Give item writers clear instructions about the need for
[illegible] importantly, guidance to
appropriate source material.

[illegible] minority
specifications. This is particularly important for figures whose [illegible]

[illegible] appropriate, to include women and minorities
[illegible] specifically intended to
reflect women's and minority contributions or concerns.

Include a copy of the test specifications in the workfolder.







Balance 



Test Type: SKILLS 

Definition: The SKILLS test is designed primarily to measure a particular
skill (reading, English usage, mathematical problem solving) that
is necessary for academic work but is assumed to be part of the
test taker's skills preparation. The subject matter of the
stimulus material has no special importance for a SKILLS test.
For example, a sentence testing English subject-verb agreement can
be about canaries, typewriters, or women novelists, just as a
mathematics problem can ask about the height of telephone poles or
of basketball players who may be male or female.

There seem to be two fundamental types of SKILLS tests:

1. Tests/sections composed exclusively of discrete items.

2. Tests/sections composed of some discrete items and some items
linked to the same stimulus (sets).

Separate discussions for balance are given below.

Balance: Discrete Items only

In SKILLS tests composed exclusively of discrete items, the
representation of males and females in people-related items should
be approximately equal. At least 10% of the people-related items
should be about minority group members and, whenever possible,
more than one minority group should be represented.

It must be recognized that some tests will deviate from these
requirements for valid reasons. For example, a test designed
exclusively for students in Bermuda may have an entirely different
balance, perceived or otherwise, than tests designed for use in
the United States or, like TOEFL, designed to test the language
abilities of non-native speakers who intend to come to the United
States. The coordinator of such a test must document such
variations (e.g. Test for use exclusively within a non-United
States population), and the test assembler must note the variation
on the sensitivity review report forms. Still, in most such
tests, a balance of male and female representative items should be
the goal.

Balance: Discretes and Sets

For SKILLS tests composed primarily of sets related to a large
number of stimulus passages, the people-related passages should,
wherever possible, have approximately equal male-female
representation. Ideally, there should also be about 10% of the
people-related passages that concern various minorities. These
criteria should be applied with more flexibility than they are in
SKILLS tests composed exclusively of discrete items.

For SKILLS tests composed of a small number of stimulus passages,
e.g. 3 as in GMAT or 4 as in GRE, only one of the passages needs
to have either women or minority representation.
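
The balance requirements above for SKILLS tests composed exclusively of discrete items (approximately equal male and female representation, at least 10 percent minority representation among people-related items, and more than one minority group where possible) can be sketched as a simple check. This is only an illustration, not an ETS procedure: the `Item` structure, the function name `check_discrete_balance`, the group labels, and in particular the `gender_tolerance` reading of "approximately equal" are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One discrete test item and the groups it was tallied under (hypothetical structure)."""
    number: int
    groups: frozenset = frozenset()  # an empty set means the item is not people-related


def check_discrete_balance(items, minority_groups,
                           min_minority_share=0.10, gender_tolerance=0.10):
    """Summarize male/female and minority representation among people-related items."""
    minority_groups = frozenset(minority_groups)
    people = [item for item in items if item.groups]
    total = len(people)
    if total == 0:
        raise ValueError("no people-related items to evaluate")

    # Each item counts at most once per group, however many people it mentions.
    male = sum("male" in item.groups for item in people)
    female = sum("female" in item.groups for item in people)

    minority_items = [item for item in people if item.groups & minority_groups]
    groups_seen = set()
    for item in minority_items:
        groups_seen |= item.groups & minority_groups

    share = len(minority_items) / total
    return {
        "people_related": total,
        "male": male,
        "female": female,
        # "Approximately equal" is read here as a gap no larger than
        # gender_tolerance * total; that threshold is an assumption.
        "gender_balanced": abs(male - female) <= gender_tolerance * total,
        "minority_share": share,
        "minority_share_met": share >= min_minority_share,
        "more_than_one_minority_group": len(groups_seen) > 1,
    }
```

A reviewer would still apply judgment to the result, since the guidelines allow documented deviations (for example, tests designed for use outside the United States).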







Balance 



Test Type: MIXED 



Definition: The general definition of a MIXED test is that it primarily
[illegible] defined and established stimulus material within a content area.
[illegible] based and [illegible]

Curriculum-based

[illegible] a particular curriculum. For example,
[illegible] interpretive
ability by means of specific French texts. Similarly,
[illegible] Engineering uses engineering-specific
stimuli identified as central to the discipline.

Occupational

The occupational SKILLS test primarily measures knowledge and
[illegible]

Balance: For MIXED tests the sensitivity review requirements for balance
must be used judiciously. The ETS standards must be kept in mind,
but the unique orientation of the test must be recognized as well.
[illegible] possibilities exist for representation
[illegible] identified and developed as far
as the important subject matter of the test will allow.

Curriculum-based

[illegible] test will generally conform to whatever
[illegible] recommended by external committees, consultants,
or internal specialists. Although this domain may have its own
[illegible] literature, 20th century [illegible],
it nevertheless remains the responsibility
of the test assembler [illegible] that, whenever reasonable and
possible, [illegible] developed. Thus, whereas a
[illegible] contemporary Black women writers,
the MIXED test [illegible] general
guidelines [illegible]






There will be curriculum-based MIXED tests that combine 
significant numbers of skills Iteras and significant numbers of 
content Iteus In sections that may or nay not be separately timed. 
For example, a test on Spanish language and culture might have 60^ 
Spanish grammar and vocabulary items ( SKILLS because the subject 
matter of the Items could be anything) and 40Z Spanish history and 
culture Items ((INTENT because the Items measure knowledge of a 
specific subject). In this type of MIXED test, the skills 
material should be evaluated In a way generally consistent with 
the balance requirements for SKILLS tests. For example, If there 
are discrete Items dealing with graonar and vocabulary, those 
referring to people should be approximately equal in male-female 
references. Similarly, the content material should be evaluated 
In a way generally consistent with CONTENT tests; the assembler 
should provide detailed specifications for such content portions 
of the test. 

The sensitivity reviewer must remember that MIXED tests of
this nature are likely to have a clear content base and general
"culture" orientation. This means, therefore, that Spanish tests,
for example, would be expected to have some representation of
Hispanic Americans but would not necessarily be expected to have
Black American, Asian/Pacific American, or Native
American/American Indian representation in either the skills or
content portions of the test. Similarly, a test like the Bermuda
test, designed for a non-White and non-United States population,
might be relatively free of any "minority" representation at all.

Occupational 

An occupational MIXED test will generally conform to the
activities, knowledge, and skills required by an occupation and
identified as central by consultants, committees, or internal
specialists. Here, as with curriculum-based tests, the test
assembler is asked only to make certain that whatever reasonable
possibilities for women's or minority representation exist are
used.






Tallying 



Guidelines for Tallying 



The basic rules for tallying, regardless of the type of test, are as 
follows : 



1. For discrete items, a reference to a male, a female, or a member of
a minority group in the stimulus, stem, or options means that the
item should be tallied under that group.

2. For discrete items, a reference to both a male and a female, or to
both a female and a member of a minority group, means that the item
should be tallied once under each of the groups mentioned.

3. For discrete items, a reference to a Black woman should be tallied
once for the Black category and once for the female category;
similarly a reference to a Hispanic man is tallied once in both
categories, and so on for other groups.

4. For discrete items, a reference within one item to several men,
several women, or several members of a minority group should be
tallied only once for each group.

5. Only United States minority groups should be tallied. For example,
a passage about the Japanese writer Mishima Yukio should not be
tallied as "Asian American" and a passage about the Maya should not
be tallied as "American Indian." Such material can, for
completeness, be listed under "other," but cannot be considered in
determining whether specifications for balance have been met.

6. References to male or female animals, birds, or mythological
creatures should not be counted for balance, and the specific
behavioral characteristics of such figures should not be extended to
human males or females for any reason. Thus, the behavior of a
female bird or a male bear is in no way representative of or
stereotypical of human behavior.
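
The six rules above for discrete items can be read as a small counting procedure. The following Python sketch is one possible illustration only, not a procedure taken from the guidelines: the `Reference` structure, the category labels, and the function names are all hypothetical. It also assumes that the gender of a person from a non-U.S. group is still tallied, a point on which the rules are silent.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels: the guidelines name the groups but prescribe no coding scheme.
US_MINORITY_GROUPS = {"Black", "Hispanic", "Asian American", "American Indian"}
GENDERS = {"male", "female"}


@dataclass
class Reference:
    """One reference found in an item's stimulus, stem, or options."""
    gender: Optional[str] = None   # "male", "female", or None
    group: Optional[str] = None    # e.g. "Black"; None if no group is referenced
    is_human: bool = True          # False for animals, birds, mythological creatures


def tally_item(references):
    """Return the categories under which one discrete item is tallied."""
    categories = set()             # a set gives "once per group per item" (rule 4)
    for ref in references:
        if not ref.is_human:       # rule 6: never counted for balance
            continue
        if ref.gender in GENDERS:  # rules 1-3: gender and group tallied separately
            categories.add(ref.gender)
        if ref.group in US_MINORITY_GROUPS:
            categories.add(ref.group)
        elif ref.group is not None:
            categories.add("other")  # rule 5: listed for completeness only
    return categories


def tally_test(items):
    """Sum item tallies over a test; "other" is kept out of the balance counts."""
    counts = Counter()
    for references in items:
        counts.update(tally_item(references))
    other = counts.pop("other", 0)
    return counts, other
```

Under these assumptions, an item that mentions a Black woman adds one mark to "Black" and one to "female", while an item that mentions three men adds a single mark to "male".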



Items 







Tallying

Passages 

1) For a stimulus or passage with a set of items attached, the
sensitivity reviewer should determine whether the stimulus/passage is
about men, or women, or a minority group. If it is and the passage is
in a SKILLS test, all of the attached items should be tallied for that
group.

In a CONTENT or MIXED test, correct identification of what the passage/
stimulus is about is also important. It is best in CONTENT or MIXED
tests, however, to maintain a double count, one count for passages and
one count for items, since the items may or may not have relevant
references and should be counted individually.

2) If a passage/stimulus incidentally mentions a man/woman/minority group
but is not about that man/woman/minority group, the reference can be
entered once in the item tally (regardless of how many times the name
of the man/woman/minority group appears) in the appropriate place on
the review form. For example, if the name Martin Luther King Jr.
appears twice in a passage about non-violent resistance, the item
tally should have one mark for Black and one mark for Male, but the
passage should not be considered male-oriented.

It should be noted that, in a passage about Martin Luther King Jr.,
once that passage has been categorized the tally does not increase
each time King's name appears in the passage.

3) In a CONTENT or MIXED test, passages or works of art that, even though
not identified as such, are by women or members of minority groups
must be counted in the tally. Thus, a painting by Mary Cassatt or an
excerpt from a poem by Langston Hughes, even if no explicit reference
is made to women or minorities and even if the artist is not
identified in the items, must be counted for the purposes of balance.

Test assemblers should always identify artists or authors who are
women or members of a minority group on cards or flimsies.

4) In a CONTENT or MIXED test, a passage that, for context/subject-matter
considerations, must contain "generic he" references or personified
objects, should be classified with care. For example, if "the West
Wind" is personified as "he" in a poem about the West Wind, the
general orientation might be considered male but each individual "he"
should not be part of the tally. A single reference to a male West
Wind in a poem about summer would not, however, make the poem "male
oriented" and a single pronoun probably should not be tallied. In a
passage with "generic he" references, the passage should be classified
as male oriented, but each "he" should not be tallied. The items for
such a passage will be tallied according to their individual
characteristics.
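
The passage rules 1) and 2) above, including the double count for CONTENT and MIXED tests, might be sketched as follows. This is a hedged Python illustration under stated assumptions, not part of the guidelines: the function name `tally_passage`, its parameters, and the category labels are hypothetical. Where the rules are silent (a SKILLS passage that is not about any group), the sketch assumes the attached items are simply counted individually.

```python
from collections import Counter


def tally_passage(test_type, about, incidental, item_categories):
    """Return (passage_count, item_count) for one passage and its items.

    test_type       -- "SKILLS", "CONTENT", or "MIXED"
    about           -- set of categories the passage is actually about
    incidental      -- set of categories mentioned only in passing (rule 2)
    item_categories -- list of sets, one per attached item
    """
    # The passage itself is categorized once, however often a name recurs.
    passage_count = Counter(about)

    # Rule 2: an incidental mention is entered once in the item tally.
    item_count = Counter(incidental)

    if test_type == "SKILLS" and about:
        # Rule 1: every attached item is tallied for the passage's group.
        for _ in item_categories:
            item_count.update(about)
    else:
        # CONTENT or MIXED (and, by assumption, SKILLS passages about no
        # group): items are counted individually, separate from the passage.
        for categories in item_categories:
            item_count.update(categories)

    return passage_count, item_count
```

With these assumptions, a passage on non-violent resistance that names Martin Luther King Jr. twice yields an empty passage count and an item tally of one mark for "Black" and one for "male", matching the example in rule 2.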






The ETS Sensitivity Review 
Process: A Commentary 
for Test Assemblers 






Table of Contents 



Introduction 3 

Inappropriate Language 3

Inappropriate Subject Matter or Tone 4

Inappropriate Underlying Assumptions 7

Stereotyping 8

Lack of Balance 9

Juxtaposition 11

Judging the Items 11 






I was breezing along through a chapter on the American Revolution when I did a double
take on one sentence. It was as if somebody had stuck a foot out there on the page and
tripped my mind as it went by. I looked again, and this sentence jumped out at me: Despite
the hardships they suffered, most slaves enjoyed a higher standard of living and a better life
in America than in their primitive African homeland. As far as I can remember, this was the
first time I was ever enraged.

—Bill Russell* 



* Russell, Bill, and Branch, Taylor. Second Wind. New York: Random House, 1979.






Introduction 



Most of the items and tests the test sensitivity reviewer judges need no change to meet the ETS sensitivity
guidelines. However, the reviewer must be aware that some items may be flawed in terms of sensitivity issues.
These flaws fall into five categories:

(1) Inappropriate language

(2) Inappropriate subject matter or tone

(3) Inappropriate underlying assumptions

(4) Stereotyping

(5) Lack of balance

The items and passages included in this section have been chosen to illustrate these basic flaws, or the
lack of them. Some of these items and passages appeared in tests produced before the ETS test sensitivity
review process went into effect; some of them never appeared in tests at all, but were removed from the pool
of available items and offered by test assemblers for use in training sensitivity reviewers. Comments following
each of the items or passages are intended both to direct attention to the problems sensitivity reviewers found
with the material and to help define what is and is not acceptable under the ETS sensitivity review guidelines.



(1) INAPPROPRIATE LANGUAGE 



Example 

Owing to her detailed and perceptive study of the modern female, Germaine Greer has become a
recognized spokesman of the women's liberation movement.
Begin with Her detailed.

(A) having made

(B) has made

(C) made of

(D) became

(E) is becoming

Commentary

Spokesman is not the term that should be used here. Advocate or leader would be an acceptable
substitute.



Example 

In an inner city, retail sales and employment are declining and low-income and minority households are
increasing. Which of the following policies would be COUNTERPRODUCTIVE to restoring its economic
health and vitality?

(A) Approval of a new circumferential freeway outside of the city

(B) Adoption of a regional tax-sharing plan for all new industrial and commercial development

(C) Concentration of federal grant monies for rent supplements to inner-city areas

(D) Diversion of highway trust monies to public transit improvements

Commentary

The term inner city carries a great many connotations not necessary to the item. Downtown area would
avoid those connotations. The inclusion of minority households in the stem is irrelevant; the pertinent
information is contained in low-income households.






Example

Rosa Martinez's vituperative review of the film cast doubt on her ability to assess the worth of cinematic
works because that film has been an overnight box office success.

Commentary

In this sentence testing English usage, the designation of a Hispanic woman as a literary critic was
undoubtedly meant to show respect for both Hispanics and women. Unfortunately, vituperative and the
implication that the critic's judgment is valueless make the item affectively negative.



Example 

To deal with the problems raised by the women's liberation movement, it demands basic changes in our
assumptions about the organization of society.

(A) it demands basic changes

(B) basic changes are what it demands

(C) there are basic changes demanded

(D) people must make the basic changes

(E) we must make basic changes

Commentary

Too often items dealing with minorities and women use the word problems, implying that the quest for
civil rights or job opportunity brings nothing but trouble and annoyance to the rest of the world. It would be
best if ETS items avoided giving support to such a negative view of the changes brought about by the civil
rights and the women's movements. The phrase problems raised could be changed to opportunities or
challenges presented.



Example 

Experience has shown that 75 percent of those hired for a certain job prove to be successful. A test is 
administered to 80 applicants and the 40 men with the highest test scores are hired. If it turns out that the test 
has zero validity, what percent of these men should be expected to be successful? 
(A) 0% (B) 40% (C) 50% (D) 75% (E) 80% 

Commentary 

The use of men in this item is unnecessary, confusing, and in violation of the ETS guidelines. A neutral
word like applicants or test takers can easily replace men.

(2) INAPPROPRIATE SUBJECT MATTER OR TONE 



Example 

Just as the ---- of a new species of insects is certain to have a profound effect on the ---- of a river
valley, so a large immigration of a new race or class is bound to destroy the social equilibrium of a city.

(A) exodus..topography

(B) influx..ecology

(C) mutation..geology

(D) discovery..population

(E) extermination..stability






Commentary 

The sentence implies that immigrants of a different race or class are, like destructive insects, bound to
destroy the territory they enter. These suggestions are totally inappropriate in an ETS test. The sentence is
affectively negative.



Example 

Both candidates agreed that such minorities must be given an opportunity to advance, to seek justice, and
to the kind of special treatment that might make up in part for past inequities.

Commentary 

This sentence (testing English usage) implies that minorities are passive, only awaiting the paternalism
of the majority to improve their lot in life. The sentence might be revised as follows:
Both minority candidates agreed that minority people must take this opportunity to speak out, to seek
justice, and to the kind of education that might enable them to make up, in part, for past
inequities in employment.



Example 

People have been in the Americas for more than 3S,000 years. Whites have been around for less than five
hundred. It is presumptuous for anyone to pretend that the Chicano, the "Mexican-American," is only one
more in the long line of hyphenated immigrants to the New World. I reject the semantic games of the
sociologists who identify us as Mexican-Americans. Our insistence on calling ourselves Chicanos stems from
a realization that we are not just one more minority group in the United States.

We are, to begin with, a powerful blend of indigenous America with European-Arabian Spain. During the
three hundred years of New Spain, only 300,000 whites settled in the New World, and most of these were
men. There were so few white people at first, that ten years after the conquest in 1531 there were more black
men in Mexico than white. Africans were brought in as slaves and soon intermarried. Miscegenation went
joyously wild, creating many hues, shapes, and sizes, but the predominant strain remained Indio.

Then in the twilight of the co ny ista do m ' Mttation of New Spain, the Indios suffered the fate of a
colonized people. Rejected by the Spanish father, they clung to their Indian mother and shared her
overwhelming sense of loss. The revolution of the thirteen colonies of New England did not t^
dants of the Indios until half a century later. Having formed a nation, the colonies eventually looked south
for their own conquests and decided to "liberate" Texas from Mexico. Mexico itself was bleeding from
internal conflict and was ill-equipped to defend its people in the war that made Texas part of the new country.
Amid this so-called liberation, the American Indio remained forgotten.

Who then are the residents of the United States known by the Chicano as Anglos? They are transplanted
Europeans, with pretensions of native origins. Their most patriotic cry is basically the retort of one immigrant
to another. Feeling truly American only when they are no longer the latest foreigners, they brandish their
Americanism by threatening the new arrival: If you don't like it here, go back where you came from.

Now the Anglo is trying to impose the immigrant complex on the Chicano, pretending that the
"Mexican-Americans" are the most recent arrivals. But we will not be deceived. In the final analysis, frijoles,
tortillas, and chili are more American than the hamburger. We do not suffer from the immigrant complex. We
left no teeming shores in Europe, impatient and eager to arrive in New York. No Statue of Liberty ever
greeted our arrival in this country. We did not, in fact, come to the United States at all. The United States
came to us.

Commentary 

This reading passage was rejected primarily because of its inflammatory tone, which might be upsetting
to various groups of test takers. The material within it is controversial and affectively negative; in this case,
the material is potentially offensive to members of both the majority and minority groups.






Example 

The only Oriental boy in a class of five-year-olds always looks down when the teacher addresses him. Of
the following, the most reasonable assumption the teacher can make about his behavior is that he

(A) probably feels guilty and thinks he has done something wrong

(B) has learned to lower his eyes for a particular reason

(C) may have trouble with his vision and should have his eyes checked

(D) does not pay attention when spoken to because he is thinking of other things

(E) may be emotionally disturbed and should be observed by the school psychologist

Commentary

The subject matter of the item is appropriate for a test given to teachers, who should be aware of
different cultural traditions among their pupils. The difficulty with the item lies in the vague key, option B,
and the affectively negative options, each of which, when placed with the stem, is demeaning or insulting.
The test taker who answers this incorrectly and who does not know that the option chosen is incorrect may
well have negative ideas reinforced.

The ethnocentric word Oriental should be changed to the preferred designation, Asian-American. The
item can be revised in several ways to avoid the negative qualities of the options. For example:
Which of the following is the most probable reason why the boy looks down?

(A) To concentrate better

(B) To show respect

(C) To avoid embarrassing the teacher

(D) To ask permission to question the teacher

(E) To avoid showing disagreement with the teacher's remarks



Example 

Frequently there is a time lag between the statement of a managerial policy and the implementation of
that policy. This appears to be particularly true with regard to the acceptance of women in management
positions. According to our survey findings, women interested in management or professional careers still face
social and psychological barriers, despite recent changes in policies on the employment of women.

The responses we received to the case examples reflect two general patterns of sex discrimination: (1)
There is greater organizational concern for the careers of men than there is for those of women, and (2) There
is a degree of skepticism about women's abilities to balance work and family demands. Underlying these
patterns of discrimination there is an assumption that is not at first apparent from the survey findings: it
appears that women are expected to change to satisfy the organization's demands. For example, written
comments from participating managers often suggest that women must become more assertive and
independent before they can succeed in some of the situations described in the case examples in the survey. These
managers do not see the organization as having any obligation to alter its attitudes toward women. Neither,
apparently, are organizations about to change their expectations of men. Perhaps because it is expected that
the job will eventually "win out" over the family, a man is given the time and opportunity to resolve conflicts
between home and job. This in itself says a great deal about how organizations might conceive of a man's
relationship with his family.

Another conclusion we can draw is that when information is scant and the situation ambiguous, managers
tend to fall back on traditional concepts of male and female roles. Only when there are clear rules and
qualifications do both women and men stand a chance of breaking out of the stereotyped parts usually
reserved for them.

When the results of this survey are extrapolated to the total population of American managers, even a
small bias against women could represent a great many unintentional discriminatory acts that potentially
affect thousands of career women. The end result of these various forms of bias might be great personal
damage for individuals and costly underutilization of human resources. If managers are sincere in wanting to
encourage all employees equally, they ought to examine their own organizations' implicit expectations of both
men and women to see whether these expectations reflect some of the same traditional notions revealed by the
survey. Identification of these biases would help managers to move toward the goal of equal employment
opportunity for all.






Commentary 

This passage was undoubtedly chosen as appropriate for showing women's concerns in a test meant for
applicants to graduate schools of business. Unfortunately, the passage does not present a positive picture of
what women entering business management can expect and can therefore be considered affectively negative
for women taking that test. It might not be considered inappropriate in another context, however. In any
case, it would be better to use the term "women" rather than "career women."



(3) INAPPROPRIATE UNDERLYING ASSUMPTIONS



Example

In order to work effectively with members of a minority group, the most important consideration is for
the social worker to

(A) be aware of his or her own self, values, and biases

(B) study the language of the minority group

(C) be sympathetic and nondiscriminating

(D) live among or close to the minority-group members

Commentary

The item assumes that the social worker is not a member of a minority group. This underlying
assumption is invalid, and some of the options are also patronizing.



Example

The fact that black community organizations perceive that economic development meets their needs does
not by itself justify a federal investment in economic development programs. There are, however, at least two
important programmatic reasons for establishing economic development programs with broad-based
community support. First of all, it is becoming increasingly more difficult for any federal program to operate in a
black community without reference to the social and political forces within the community. The civil rights
movement and several years of operating community action programs have made a change in the environment.
Black people have become more skilled in the techniques of organization and of communication with the white
community. As a result, it has become virtually impossible to implement any meaningful program without
active community participation.

The second program reason for community control is directly related to the fact that the social utility of
economic development involves multiple benefits. As long as programs involve single, separate, quantifiable
outputs such as total employment, total number of houses built, etc., a strong case can be made for having the
ultimate control of the program in the hands of the technicians who are better equipped to achieve these goals
and to optimize the various combinations of cost-benefit relationships. However, community economic
development requires that trade-off decisions be made involving nonquantifiable comparisons. Given the fact that
the state of the art of cost-benefit analysis is, and in the near future will continue to be, much too crude to
permit any semblance of objective cross comparisons of social benefits, the question becomes, "Who should
decide between social benefits?" If someone has to make these judgments, it is reasonable to assume that the
perception of the community that has to suffer any mistakes is a better guide than the perception of outside
professionals who lack both the conceptual framework and the data for rational analysis.

This does not mean that residents can develop their community without outside help. This is particularly
true in programs of business development that by nature involve complex interrelationships between people,
require considerable technical competence, and presume a certain common frame of reference among
participants. The trick, therefore, is to find the combination of community control and technical capability that will
produce responsive policies and competent programs. In effect, there would be a partnership between black
institutions and the establishment with government equipping the institutions with the fiscal base for
negotiation.






Commentary 

This passage basically sets up a "we-them" situation, with the "we," who are knowledgeable and
technically skillful, proposing to help the "them," who cannot provide any of the knowledge or skills
required out of their own resources. The author is not unaware of the need for community involvement and
cooperation or the skills that the community will be able to provide. However, the author assumes that the
entire community will be ignorant of the knowledge required and deficient in the skills needed. This
underlying assumption is what leads to the affectively negative aspects of the passage.

Statements like "Black people have become more skilled in the techniques of communicating with
the white community" reinforce the inappropriate underlying assumption. It would be more appropriate to
suggest that Black and White communities have become more skilled in communicating with each other
rather than implying that any failures in communication are the responsibility of the Black community
alone.



(4) STEREOTYPING



Example

Khrushchev's gift to history is, and always was, himself. Khrushchev's greatest qualities, those that
distinguished him from all other Soviet leaders, were his energy, his enthusiasm, his confidence in himself and
in others. It was his prodigal personality, his ability to confess a mistake and reverse himself, his explosive
unpredictability, that did more than anything else to spring the genie of spontaneity out of the bottle of
repression in which Stalin had contained the Russian spirit for 30 years. Khrushchev was the quintessential
Russian peasant. He was cunning and sly. He was given to the charming, fantastical Russian kind of lying
called vranyo and to extremes, like the muzhik who works hard and then spends days on a drinking spree.
Coming at the moment of Russian history when he did, Khrushchev's great contribution was his confidence in
the Russian people and his effort to give them confidence in themselves.

Commentary

The stereotype in this passage is of Russian peasants, a group not explicitly covered in the guidelines.
However, the passage does present a stereotype, the stereotype is offensive, and the guidelines do indicate
that offensive stereotypes are to be avoided. The material could be affectively negative for certain members
of the population taking the test.



Example

The cartoon above from the early 1960s depicts

(A) a newly revived tribal dance

(B) communist anger at American involvement in Vietnam

(C) the displeasure of communist leaders over the closing of the Suez Canal

(D) efforts to increase communist influence in Africa

(E) the resentment of Mao and Khrushchev at African attempts to mediate the Israeli-Arab conflict







Commentary 

The cartoon is offensive because it stereotypes Africans as primitive, spear-carrying people in grass
skirts or leopard skins. The figures are meant to represent Khrushchev and Mao.

Although material of a satiric nature often raises issues for sensitivity reviewers, it is possible to use
cartoons and the like that meet ETS sensitivity review guidelines. The following material is acceptable.



Example 

Questions 46-48 are about the following cartoon.




46. The man on the bottom in this cartoon represents

(A) the federal government

(B) a labor mediator

(C) the consumer

(D) the farmer

47. What is the main point of the cartoon?

(A) Labor-management disagreements often lead to violence.

(B) The government has no power to stop strikes.

(C) Farmers have little influence on national politics.

(D) The public is often hurt by labor-management disagreements.

48. The way the fighters are drawn suggests the artist believes that

(A) both labor and management obey the rules in their disagreements

(B) both labor and management are powerful forces

(C) the government has too much control over labor and industry

(D) both labor and management want help in solving their disagreements

Commentary 

The cartoon and the questions following it are acceptable. The exaggeration in the cartoon and the
choices in the items do not present a derogatory picture of any group.



(5) LACK OF BALANCE 



Balance in a test, for the purpose of sensitivity review, generally involves including items that present
males and females in approximately equal numbers, showing women as well as men as active participants in
the world at large, and presenting the contributions of members of various minority groups or describing
the history and culture of such groups as well as those of the majority group.






The following group of items illustrates another kind of balance that the sensitivity reviewer may
comment on. Further, it is important to note that the items appeared in a descriptive booklet (all ETS
publications are subject to a mandatory sensitivity review) and that the question raised for the sensitivity
review is not covered directly in the ETS sensitivity review guidelines. The items are being used to describe
the kinds of items that appear in a humanities test.

Careful consideration of balance is most appropriate in operational sections/tests made up of discrete
items. In a reading comprehension pretest made up of only two or three passages, the test assembler may
have content specifications that do not include minority/female representation. Such pretests cannot be
challenged for balance.

Examples 

1. Often read as a children's classic, it is in reality a scathing indictment of human meanness and greed. In
its four books, the Lilliputians are deranged, the Yahoos obscene.

The passage above discusses

(A) Tom Jones (B) David Copperfield (C) The Pilgrim's Progress (D) Gulliver's Travels
(E) Alice in Wonderland

2. Which of the following deals with the bigotry an anguished Black family faces when it attempts to move
into an all-White suburb?

(A) O'Neill's Desire Under the Elms

(B) Miller's Death of a Salesman

(C) Williams' A Streetcar Named Desire

(D) Albee's Who's Afraid of Virginia Woolf?

(E) Hansberry's A Raisin in the Sun

3. Which of the following has as its central theme the idea that wars are mass insanity and that armies are
madhouses?

(A) Catch-22 (B) Portnoy's Complaint (C) Lord of the Flies (D) Heart of Darkness
(E) Vanity Fair

4. Which of the following is often a symbol of new life arising from death?

(A) A gorgon (B) The minotaur (C) A unicorn (D) A griffin (E) The phoenix

5. Which of the following musical forms is divided into the sections: Kyrie, Gloria, Credo, Sanctus,
Benedictus, Agnus Dei?

(A) A symphony (B) A piano concerto (C) A mass (D) A madrigal (E) A cantata




1. The work pictured above is

(A) a fresco (B) a stabile (C) a woodcut (D) an illumination (E) an etching

2. The theme of the work is the

(A) sacrifice of Isaac (B) expulsion from Eden (C) reincarnation of Vishnu

(D) creation of Adam (E) flight of Icarus

3. The work is located in the

(A) Alhambra (B) Sistine Chapel (C) Parthenon (D) palace at Versailles

(E) Cathedral of Notre Dame





This painting is a visual allusion to which of the following pictorial themes?
(A) The Annunciation (B) The Flight into Egypt (C) The Adoration of the Magi
(D) The Pietà (E) The Descent from the Cross

Commentary 

The question of balance raised by these items is whether only those familiar with Christian tradition
can achieve a good score on the test. Christianity has indeed influenced the music, painting, and other arts
of Western civilization, and testing knowledge of these influences may certainly be appropriate, depending
on the purposes of the test. The questions the assembler and the test sensitivity reviewer must decide are:

• Do these items indeed reflect the proportion of questions on the test dealing with Christian
tradition?

• If they do, is such emphasis on detailed knowledge of Christian tradition justifiable?



JUXTAPOSITION 



Sometimes two items are acceptable from the sensitivity reviewer's point of view, but they present a
problem because they are juxtaposed. Juxtaposition can permit an unwelcome and unintended association
between ideas. For example, an item dealing with Black women followed by one dealing with welfare might
cause some test takers to make an unwarranted association of Black women with welfare recipients. Such
items should be separated.



JUDGING THE ITEMS 



Nothing involving human relationships is ever cut-and-dried, and reviewing materials for potential
offensiveness to particular groups of people is no exception. The guidelines for reviewing are just that:
guidelines. They do not indicate exactly how every item or passage undergoing sensitivity review is to be
interpreted or under what circumstances material that might be regarded as inappropriate for one test
becomes acceptable, or at least tolerable, for another. Because there is leeway for debate about some
items and their use in a particular test, sensitivity reviewers are encouraged to discuss with other sensitivity
reviewers material that they consider potentially offensive.

The need for discussion is particularly apparent when the sensitivity reviewer considers the material
potentially offensive enough to be removed from a test because no amount of rewording will make the
material acceptable. Before the sensitivity reviewer embarks on such a course, it is important that he or she
determine from discussions with other sensitivity reviewers whether the recommendation to remove material
from a test is an idiosyncratic or individual response or the informed response of a group of sensitivity
reviewers. Throughout the process, discussion of disputed items is encouraged not only among sensitivity
reviewers but also between the sensitivity reviewer and the assembler and among all the parties interested in
the outcome of the dispute.






Example



My grandmother's notorious pugnacity did not confine itself to the exercise of authority over the neighborhood.
There was also the defense of her house and her furniture against the imagined encroachments of
visitors. With my grandmother, this was not the gentle and tremulous protectiveness of certain chronically
frail people, who infer the fragility of all things from the brittleness of their own bones and hear the crash of
mortality in the perilous tinkling of a tea cup. No, my grandmother's sentiment was more autocratic: She
hated having her chairs sat in or her lawns stepped on or the water turned on in her sinks, for no reason but
pure administrative efficiency; she even grudged the mailman his daily promenade up her sidewalk. Her home
was a center of power, and she would not allow it to be insulted by easy or democratic usage. Under her
jealous eye, its social properties had withered and it functioned in the family structure simply as a political
headquarters. Family conferences were held there, consultations with the doctor and the clergy; unruly
grandchildren were brought there for a lecture or an interval of thought-taking; wills were read and loans
negotiated. The family had no friends, and entertaining was held to be a foolish and unnecessary courtesy required
only by the bonds of a blood relationship. Holiday dinners fell, as a duty, on the lesser members of the
organization: Sons and daughters and cousins respectfully offered up Baked Alaska on a platter, while my
grandparents sat enthroned at the table, and only their digestive processes acknowledged the festal nature of
the day.

Commentary 

Some test sensitivity reviewers thought this passage inappropriate because it describes an unpleasant
woman. Others had no objection to the passage, because it was obviously describing an individual woman
and not all grandmothers.

Presented to a panel of experts in literature assembled at ETS to discuss sensitivity issues in the testing
of literature, the passage was approved. The crucial factor for the panel was that the person described in the
passage is obviously a character of considerable individuality and not stereotypical in any way.



Example

The Mescalero Apache tribe is one of seven linguistically and culturally related peoples whose aboriginal
territories stretched over large sections of present-day southwestern United States and northeastern Mexico.
The Mescalero were characterized by an economic system that harmonized well with their challenging
environment. In late historic times they attempted desultory farming along some watercourses, but the severe
weather and short growing season of the mountains and the precarious water supply of the lowlands did not
encourage cultivation of the soil. Thus the Mescalero were forced to depend on hunting and the gathering of
wild harvests.

Such an economy required mobility; there had to be readiness to follow the food harvests when and where
they matured and to move from one hunting area to another when the supply of game dwindled. A
concentration of population was inappropriate to such techniques of food procurement. As a result, the population was
thinly dispersed over the immense range.

Since most economic errands were carried out in small groups, there was little incentive for highly
centralized leadership. It is probable that never in its history did the tribe have a single leader who was
recognized and followed by all. Rather, the Mescalero leader, or "chief" (literally "he who speaks"), was, as
his title suggests, a respected adviser drawn from the heads of the families who tended to camp and move
together.

Since he had no coercive power, he had to understand what his followers were willing to do. Serious
misjudgments or unpopular counsel might cost him his position or a portion of his followers. Theoretically, the
office of the leader was not hereditary; in practice, there was a tendency for sons of leaders to succeed their
fathers. This was informal, however, not absolute. Typical situations which required a leader's judgment
included such problems as whether to move to another site because of poor luck in hunting, repeated deaths,
epidemic disease, or the proximity of enemies; whether to sanction a raid or war party; whether to sponsor an
important social or ritual event to which outsiders might be invited; and what to do about disruptive behavior
such as the practice of witchcraft. The ability to lead successful raids and war parties, as well as to sanction
them, was a great asset for a leader; such expeditions meant booty, and this made it possible to distribute
favors widely. In a society where generosity was one of the cardinal virtues, such activity built and sustained
the good will so important to a leader.



Commentary 

The main objection to this passage, in discussions among sensitivity reviewers, was that it makes no
mention of women. Further, some of the language was considered demeaning, for example, the phrase
"desultory farming." Others who reviewed the passage had no objection either to the failure to mention
women or to the language of the passage. They held that the society being depicted was a male-dominated
society, that women in that society were subjugated, and that it might be more insulting to women to
describe that subjugation than to omit mention of women entirely. They had no criticism of the use of the
word desultory, considering that a nomadic people would not farm in any other way. The basic argument of
these reviewers was that the Mescalero Apaches lived a life different from that of modern inhabitants of
North America and that failure to describe that life accurately was, in a sense, an admission that that way
of life was to be regarded as not so much just different but inferior. The counterargument was that not all
readers of the passage would be knowledgeable about various kinds of societies and would tend to view as
inferior a society quite different from what they considered the best or the norm.

A panel of historians was invited to ETS to discuss with staff various issues that had caused concern
among both test assemblers and sensitivity reviewers. In its review of the materials presented to it, this panel
made a clear distinction between material that it considered suitable for history tests and material that
would be appropriate in reading passages. The view of panel members was that no subject of importance is
to be avoided in a test designed to be taken by students of history, although care should be used in the
presentation of material. Merely including the date or source of an opinion would suffice for some material.
They stated that history was not always pleasant and that unpleasant aspects of history should be studied
and knowledge of those areas of history should be tested. They also maintained, however, that considerable
circumspection was needed in choosing passages about minority groups and the history of minority groups
for use in reading tests. Unlike history students, takers of such tests cannot be expected to supply or
understand a context for a particular idea that might be potentially offensive or disturbing.

The historian who discussed the passage about the Mescalero Apaches with ETS staff, an American
Indian himself, asserted that descriptions of members of the various Native American nations and tribes in
tests and other materials contain three major faults:

(1) Native Americans are dealt with as peoples of the past. Very little attention is paid to the American
Indians living, working, accomplishing today.

(2) They are defined in two ways only: as lovers of nature or as fierce warriors. (He could not decide,
he said, which stereotype he disliked more.)

(3) In most materials, American Indians appear to have lived in a society without women.

Given these stereotypes, the passage on the Mescalero Apaches is to be considered inappropriate for
use in a test.



Example 

George Bernard Shaw explicitly advises women to be selfish. Of course, his play Major Barbara reminds
us that selfishness is not for females only. In Undershaft, the munitions millionaire, selfishness is bolder in
outline and broader in scope than anything Dorothea or Ibsen's Nora or Undershaft's daughter could achieve.
Undershaft will see the world blown apart by his munitions before he will submit himself to the degradations
of poverty. None of the women quite reaches this pinnacle of assertion. Yet the actual pattern is not different;
and however small the framework, however delicate the tracing, the quality of selfishness in women needs to
be emphasized just because it is so difficult to achieve. That remark may sound dubious to readers who know
selfish women in life and literature; but these are examples of petty selfishness, not grand selfishness, and of
the old-fashioned, not the Shavian new woman.

The grand selfishness Shaw recommends is not self-serving, but self-respecting; it does not result in petty
self-seeking, but in a rehabilitation of the idea of the self. Selfishness is the opposite of meekness, humility,
and self-sacrifice (the so-called womanly virtues), not the opposite of generosity and altruism (the virtues of
strength). In a badly arranged world, meekness and acquiescence are dangerous virtues.

In fact, two pieces of spiritual advice sum up this little book and could equally well stand for the whole of
Shaw's advice to women. The first is the Johnsonian, "My dear friend, clear your mind of cant." The second,
a comment with reverberations, is: "Always have the highest respect for yourself, and you will be too proud to
act badly."




Whether this quality is called self-respect, pride, or egotism, [illegible]
accept in women or for women to accept in themselves. According to Lionel Trilling, it is this quality in the
heroine of Jane Austen's Emma that dismays the critics and introduces an equivocal note into their judgment
of the novel. It is this that is distasteful in the heroine of Charlotte Brontë's Shirley, who is a plausible
ancestor of Shaw's Lydia Carew: a self-willed, [illegible] woman who defies the world to censure her, who
feels sure that "what everybody knows" is wrong and that her own unconventional views are right. One
measure of the change in social atmosphere between [year illegible], when Shirley was published, and [year illegible], when Shaw
completed Cashel Byron's Profession, is Shirley's assertion of her right to certain masculine prerogatives,
which she self-consciously pursues as an end in itself. Lydia, on the other hand, treats her feminine
self-assertion quite absentmindedly, with the main line of her attention focused on the objects to which this assertion
admits her: interesting studies and rationalized rules of behavior.

39. According to the author, the actual existence of selfish women may lead some people to conclude that

(A) all women are selfish

(B) women who are selfish are simply acting normally

(C) society does not demand either self-sacrifice or meekness of women

(D) self-sacrifice is only one of the possible patterns of action available to women

(E) it is easy for women to exhibit selfishness

40. The author uses the example of Undershaft to

(A) show Shaw's tendency to exaggerate

(B) contrast Shaw's characters with Ibsen's

(C) demonstrate the realism of Shaw's characterizations

(D) present a contrasting model for Shaw's women

(E) rebut Shaw's primary contention

41. The author meets an anticipated objection to the statement that it is hard for women to behave selfishly
by

(A) presenting an example

(B) making a distinction

(C) citing an authority

(D) describing historical conditions

(E) examining literary attitudes

42. Which of the following provides the clearest instance of the selfishness that Shaw recommends?

(A) A woman who allows others to work to support her but does not help to provide for her family's
needs

(B) A woman who makes other members of her household defer to her preferences

(C) A woman who insists on training for and practicing a profession that she chooses

(D) A woman who marries for social advance and does not attempt to make her husband happy

(E) A woman who obtains the largest part of an inheritance by flattering a wealthy grandparent

43. As described by Lionel Trilling, the response of critics to Jane Austen's Emma was influenced by the fact
that they

(A) found it hard to accept the character of an assertive woman

(B) did not understand the tradition in which the novel stood

(C) failed to appreciate the subtlety of Jane Austen's characterization

(D) thought it unsuitable for women to write novels

(E) found the novel to be ambiguous in its values

44. It can be inferred that the qualities of Charlotte Brontë's heroine in Shirley are "distasteful" to

(A) the author of the passage

(B) many feminists

(C) many readers of the novel

(D) George Bernard Shaw

(E) Lionel Trilling






45. It can be inferred that Lydia Carew is which of the following?

(A) A nineteenth-century novelist

(B) A career woman in Shaw's Major Barbara

(C) A political figure

(D) A member of a group that agitated for women's rights

(E) A character in Cashel Byron's Profession

46. It can be inferred that the author views the lessening, between the times of Shirley and Lydia, in the
self-consciousness of women who asserted themselves as

(A) a cause of deterioration in sexual relationships

(B) an adoption of ladylike behavior

(C) a setback for society

(D) progress made by women

(E) a symbol of women's renunciation of egotistic behavior



Commentary

The consensus of sensitivity reviewers who looked at this passage was that they would approve its use
in a test. They also recognized that some sensitivity reviewers might object to the passage and to items 39
and 42 particularly. However, the group of reviewers approving the passage deemed that it does not depict
women either negatively or stereotypically and that items 39 and 42 were acceptable in that each was being
used to define selfishness, a crucial point in understanding the passage and the attitude presented.



Example

The days between Christmas Day and New Year's were allowed the slaves as holidays. During
these days all regular work was suspended, and there was nothing to do but keep fires and look after the
stock. We regarded this time as our own by the grace of our masters, and we therefore used it or abused
it as we pleased. The holidays were variously spent. The sober, industrious ones would employ them-
(5) selves in manufacturing corn-brooms, mats, horse-collars, and baskets, and some of these were very well
made. Another class spent their time in hunting opossums, coons, rabbits, and other game. But the
majority spent the holidays in sports, ball-playing, wrestling, boxing, running, foot-races, dancing, and
drinking whisky; and this latter mode was generally most agreeable to their masters. A slave who would
work during the holidays was thought by his master undeserving of holidays. There was in this simple
(10) act of continued work an accusation against slaves, and a slave could not help thinking that if he made
three dollars during the holidays he might make three hundred during the year. Not to be drunk during
the holidays was disgraceful.

1. Why was "this latter mode ... most agreeable to their masters" (line 8)?

(A) It permitted the slaves to return to their work with renewed vigor and interest.

(B) It invigorated the entire plantation with a spirit of well-being and cooperation.

(C) It put to use materials and assets that were difficult to sell on the open market.

(D) It appeared to provide a necessary break in a life of continuous labor and service.

(E) It seemed to confirm the slave owner's belief that slaves were not interested in living as industrious
freemen.

2. The tone of the last sentence is

(A) ironic and bitter

(B) jovial and hilarious

(C) pedantic and learned

(D) servile and cooperative

(E) cajoling and pleading

3. The passage is from an autobiographical account by

(A) James Baldwin

(B) LeRoi Jones

(C) Frederick Douglass

(D) Richard Wright

(E) Ralph Ellison

Commentary 

This passage raises several questions about its appropriateness in a literature test:

• The passage is about slavery, a highly emotional subject for some test takers.

• The passage is written by Frederick Douglass, an important figure in Black history.

• The passage presents a picture of the kind of subtle influences a master used to control the behavior of
slaves.

• The passage specifically mentions drunkenness among slaves.

Considering these issues, some reviewers would deem the passage inappropriate because, although a
factually accurate account, it is affectively negative. Others would consider the passage appropriate to use,
in accordance with the views expressed by the panel of historians, in a test designed for history students.
Resolving the issue of whether the passage should be included in a given test will require considerable
discussion among an extended group of people, all involved either in the development of the test or in the
sensitivity review process. No matter what their decision about the passage, however, the items contain
options that are inappropriate. The test taker who chooses to answer the first item with option B, for
example, has clearly misread the passage or has some insupportable ideas about slavery. This test taker,
however, has no opportunity to discover that B is an incorrect answer. It is to avoid reinforcing such ideas
in the minds of those who choose the wrong answer that options like B are to be removed from test items,
in accordance with the ETS sensitivity review guidelines. Rewording other options in the item, particularly
A and E, will improve the item from the sensitivity reviewer's point of view. Option E, for example, should
have a word like erroneous or mistaken inserted before belief, and freemen should be revised. Similar
changes in wording are called for in item 2. Options B, D, and E convey an impression of the writer's
attitude that is not appropriate.



Example

Questions 3, 4, and 5 refer to the following excerpt from a United States Supreme Court decision.
"That woman's physical structure and the performance of maternal functions place her at a disadvantage in
the struggle for subsistence is obvious. This is especially true when the burdens of motherhood are upon her.
Even when they are not, by abundant testimony of the medical fraternity, continuance for a long time on her
feet at work, repeating this from day to day, tends to injurious effects upon the body, and, as healthy mothers
are essential to vigorous offspring, the physical well-being of woman becomes an object of public interest and
care in order to preserve the strength and vigor of the race."

3. The views expressed in the excerpt above most probably supported legislation

(A) regulating the hours and working conditions of women

(B) prohibiting the employment of women in specified industries

(C) providing medical clinics for women in specified industries

(D) encouraging the use of birth-control techniques

(E) permitting health insurance companies to charge higher rates to women employed in specified
industries

4. In arriving at these views, the Supreme Court

(A) followed a strict constructionist line

(B) held as closely as possible to precedent

(C) admitted the legal relevance of statistical, sociological, and historical data

(D) paid close attention to the intent of the legislature

(E) followed the Dillon rule

5. The views expressed in the excerpt above most closely reflected those of contemporary

(A) socialists

(B) feminists

(C) Progressives

(D) eugenicists

(E) Democrats

Commentary 

From the point of view of the sensitivity reviewer, this stimulus is unacceptable for use in an ETS test.
It restricts women to their traditional roles as mothers and [illegible] and makes no provision for the student
who, unaware of the source of the quotation, accepts the view presented as an accurate one espoused by
ETS. Further, the head note, by pointing to the Supreme Court as the source of the quotation, gives stature
to the point of view expressed.

From the point of view of the historian, the cited decision is a crucial one in United States history, in
that it represented a departure in the method by which the Supreme Court justified the decision (as question
4 indicates). It is also important in the history of women's rights because it defined the legal status of
women (a woman is not a person under the law) and led to legalized discrimination against women, but
eventually gave impetus to the movement to give women equal rights by constitutional amendment. Further,
the decision upheld protective labor legislation for women that provided for such things as stools for
saleswomen to sit on so that they would not have to stand for 12 hours a day. Humane treatment for
women workers eventually led to humane treatment for all workers, and the decision is therefore crucial in
the history of the American labor movement.

When two such equally important issues are at stake (a view of women that might upset some
candidates and reinforce stereotypical thinking in others versus a need to preserve the integrity of the subject
matter of American history, unpleasant though it may be), the need for compromise is apparent. In this
case, a brief statement identifying for the test taker the date of and historical context for the decision will
make the quotation acceptable in a history test. Such a compromise might have little effect in some other
situations. On items related to the stimulus, the inclusion of a date definitely changes the nature of the task
required by the questions, particularly question 5. That question, from the test assembler's point of view,
may have to be omitted from the test. When the constraints of the sensitivity guidelines and the constraints
of the pool of items available to meet test content specifications cause such opposing tensions, working out
a mutually acceptable compromise is never easy, but it can usually be done.






Mr. Edwards. Thank you very much, Dr. Dwyer, and Miss Rigol. 

There is a vote on the floor of the House of Representatives. We
will recess for about ten minutes.

Mrs. Schroeder. Mr. Chairman, could I just say one thing? I'm
not going to be able to come back because I have to work on this
bill. But I just want to say that I'm very disappointed because,
when I hear you saying that women are more accurately predicted
by the tests than men, then you're really saying we're overpredicting
men. That is the whole basis of what the first panel was
saying: why are we overpredicting men?

The second thing that disturbs me is that you are telling us that
there is this new group of women taking the test, but not to
worry, that they're from a lower socioeconomic scale. I don't think
that's relevant. I think it is what kind of academic backgrounds
they have. My understanding is the new group of women, and the
fact that more are taking the tests, is because they're being encouraged
to by their advisors because of their academic performance.
So that's what I think is relevant.

If they are really academically much lower than the males 
taking the tests from high school, then that's different. But I don't 
care about their socioeconomic background as much as I do their 
academic background. If you could address those two things for the 
record, I want the statistics on what kind of academic background 
they have, not their income background. 

That is the second bell and we do have to run. I'm sorry, Mr. 
Chairman. 

Mr. Edwards. That's all right. We have plenty of time. 

Dr. Dwyer. Let me just say (and Miss Rigol may certainly want
to comment, too) that the women who take the SAT now are less
likely than the men to have taken the academic program in high
school, for example. I think that speaks to the level of their academic
preparation. We all have a sense of what the academic program
is. There are more women than men who haven't taken that
program who take the SAT.

[Whereupon, the subcommittee was in recess.] 

Mr. Edwards. The subcommittee will come to order. 

Well, Dr. Dwyer and Miss Rigol, is it your testimony that your 
testing products, your examinations, are just about as good as they 
can be made, insofar as bias is concerned, that they are very fair? 

Ms. Rigol. I believe they are. I don't believe that we should be 
complacent and say we're never going to look again and they are 
perfect, just to put them on the shelf. I think we have to continue 
to evaluate them. 

But based on what we know so far, I believe they're as fair as
they possibly can be and we'll just continue to work at it. That's
my response.

Dr. Dwyer. I would agree with that. I would also be immodest 
and say that I think the test development procedures used at ETS 
set a standard for other test developing organizations. But I would 
get a stage more nitty-gritty than that and say that not only do we 
not have to become complacent, but we make new tests every day. 
We have to apply the procedures that we do have every day. In 
doing so, I think we continually learn better ways to do it. 






The test sensitivity review standards that I entered into the
record, for example, just within the past couple of months were
completely revised because we felt that our reviewing experience
had taught us so much about how to do that process that we
needed to rethink it.

Mr. Edwards. So what you're saying is there was some bias in
the past, but you think you have eliminated it?

Ms. Rigol. I would like to respond.

I think that society has changed. There are things that would
have been considered perfectly acceptable by the vast majority of
us 20 years ago that grate on us now. In many cases, these are
subtle changes. Looking back over old SAT items, I notice things:
for example, we might have had a math question about how many
shirts can a woman iron in three hours. Well, that is just not
reflecting real life situations any more, for many women, at any
rate. Those kinds of questions have been changed.

So, obviously, we have to keep updating 

Mr. Edwards. So that's obviously bias? 

Ms. Rigol. If we had that in there now. I'm not sure if it was
biased 20 years ago or not. But certainly our perceptions of
what

Mr. Edwards. Well, we had a pretty biased world 20 years ago,
so far as women were concerned, so you certainly do know it was
biased. Of course, it was. Twenty years ago we weren't interested
in having women involve themselves to the extent that they are
involved in American life, so that was bias.

Ms. Rigol. That would be bias.

Mr. Edwards. How do you look for bias? How do you find it? 

Dr. Dwyer. I start off by, first of all, looking at the individual
test questions in terms of what it is they're supposed to be asking. I
realize this sounds like I'm starting a long way back. But I think
the kind of items that invite different interpretations from men
and women or from blacks and whites are poorly written or confusing
items. So I think a good safeguard against bias is to make sure
that you know exactly what educational point you intend to test
and checking very carefully to make sure that it doesn't drag in a
lot of superfluous information that might be related to factors like
race and sex.

The other things that I mentioned earlier get at very specific 
kinds of bias, the bias that can be introduced by having a person 
read something in a testing situation that they find upsetting or 
offensive or that triggers in their mind some response that is just 
not productive and not related to the educational goals of the test- 
ing. 

Mr. Edwards. Insofar as math is concerned, young women 
coming from rigorous schools, prep schools— for example, Concord 
Academy or places like that— do they do just as well as young men 
coming from rigorous prep schools? 

Ms. Rigol. I don't know offhand, but I would be glad to gather
the information for a group of some selective and rigorous independent
schools and compare men's and women's scores. I can provide
that to the subcommittee.

Dr. Dwyer. May I be permitted a personal anecdote?

Mr. Edwards. Yes, please. 






Dr. Dwyer. My daughter is a college student, a mathematics
major, and was shocked to find, when she looked into the number 
of graduating math majors at Harvard, in the year she was looking 
at, I believe there were ten people in math, of whom only two were 
women. That is not the overall sex ratio there. 

Mr. Edwards. Thank you. 

Ms. LeRoy. 

Ms. LeRoy. Are there not studies that control for these factors, 
that show that when you discount for socioeconomic status and 
when you discount for educational background, when you take girls 
and boys who have had the same number of years of math and 
come from the same kinds of backgrounds, that girls still perform 
worse than boys on these tests? 

Ms. Rigol. No, I think the data is just the opposite, that when
you adjust, or at least take into account, the academic preparation 
of the groups, that the gaps are not nearly as wide. 

Ms. LeRoy. But they're still there. 

Ms. Rigol. I again do not have that data right in front of me, but
I do know that, generally, when you control for number of years of
academic study, you do account for a large proportion of the
difference.

Ms. LeRoy. ETS has done those studies, is that what you're 
saying, and that that's your understanding of their results? 

Ms. Rigol. The College Board has

Ms. LeRoy. I'm sorry, I meant the College Board.

Ms. Rigol [continuing]. Has displayed the information on a
number of years of study in a number of publications, and will be
coming out with a series of reports this next fall that will show, by
specific course preparation, the scores for different groups. It does
show that it accounts for at least some (I don't know if it is all) of
the gap.

Ms. LeRoy. We would be interested in seeing those studies when 
they're done. 

Ms. Rigol. Yes, I can certainly provide that information, too.

Ms. LeRoy. Are there some types of questions, math questions,
for example, that you know women do as well on or better than
men, and are there some types of questions that you know that
men do better on than women? And what is the ratio of those types
of questions within the SAT?

I mean, have either of your organizations thought about including
more of one type of question to try to adjust the test score
differential?

Dr. Dwyer. Let me try to address that.

In most testing situations, you don't start with the premise that
you want to construct a test that makes groups equal. You start
with an educational premise of some sort. I think, as a generalization,
across a number of tests, you would be looking at either the
distribution of courses that material is to be based on, or within a
course of the topics, to be covered. The specifications for the test,
the determination of how many of what kind of questions to be
included, should follow from those educational considerations rather
than from score difference considerations.

There have been (and the earliest study I can quickly think of is
maybe 1958) people doing research with very small samples of
items and people, that look at particular variations of questions:
for example, setting a question in a male context and then in a
female context, and seeing what the sex differences are. There
have been some mixed findings on that, and I think part of that is
a disagreement about what ought to be considered "male" and
what ought to be considered "female." One of the original studies
called a question about sewing the "female" version, and a question
about a snail crawling up the wall the "male" version. That
makes us think it's a little hard to interpret.

We have generally observed that females tend to score lower 
than males in math and science areas and higher in humanities 
and the arts, when other things are equal. Again, I'm speaking 
very broadly over a lot of different kinds of data.

Ms. LeRoy. What about within the math area? That's really the 
focus of my question. For example, data sufficiency questions as op- 
posed to spatial relationship questions, one group does better than 
the other, right? 

Dr. Dwyer. Yes. There is a large body of psychological literature
that examines the question of sex differences in spatial relations.
That is probably a much better researched area than any of the
others that I've spoken of.

Ms. LeRoy. And do the College Board and ETS look at those tests 
to see how those questions—you know, what percentage of the test 
is made up of different types of questions? 

Dr. Dwyer. Oh, yes. For all the tests that we produce, we begin
with a set of test specifications that describe how many of what 
kind of items go into the test. 

Ms. LeRoy. And how do you come out on those kinds of— I mean, 
what is the ratio? Do you have those statistics? 

Dr. Dwyer. Let's see. You know, I can picture a chart in my
mind this morning, and I'm not sure if I'm going to be able to read
it from my mental image.

I'm not going to be able to do that accurately.

Ms. LeRoy. Well, could you provide that information to the
subcommittee, if it's available, at a later time?

Dr. Dwyer. You're thinking about the percentage of

Ms. LeRoy. The types of questions that you know that there are 
sex differences on in terms of the correctness of the answer or the 
number of correct answers. Women answer certain kinds of ques- 
tions—have an easier time with certain kinds of questions, and 
men have an easier time with other kinds of questions. How are 
the tests made up in terms of how much of each of those types of 
questions they include? 

That's a very unscientific description of what I'm getting at.

Dr. Dwyer. I know what you're getting at. I think what I can say
is that I can tell you exactly what's on the test, and I can show you
also the research that speaks to differences between men and
women on things that are like what's on the test. But I can't form
a causal relationship for you.

Are you talking about the SATs? 

Ms. LeRoy. Yes. 

Dr. Dwyer, in an article that you wrote in 1976, you said the ETS
method for deciding the content of the SAT systematically favors
boys and that--this is a quote--"probably an unconscious form of
sexism underlies this pattern. When girls show superior performance,
balancing is required; when boys show superior performance,
no adjustments are necessary."
Have you changed your mind about that?

Dr. Dwyer. I have not changed my mind about the basic statements
that I made at that time, that I believe unconscious sexism
causes people to accept, without question, women's inferior
performance. I believe that my comments, and those of many other
people in the same vein, have led to a number of differences in the
way that we address bias at ETS and elsewhere.

I would also like to say that, at the time that I wrote that, there 
was virtually no interest in women's mathematical scores, and that 
situation has since been dramatically reversed. In the research 
field, and not just at ETS, but throughout educational research, 
that is a topic of intense interest these days. 

Mr. Edwards. Counsel? 

Mr. Slobodin. Thank you, Mr. Chairman. 

Democratic counsel was interested in a study on the question of 
math performance and what percentage of that is accounted for by 
the preparatory aspect of it. 

From the study I had quoted earlier, which is quoted in Miss
Rosser's report--or is cited as a source in her report--the same
study, it says "Based on a more broadly representative sample,
their study suggests--" this is a Pallas and Alexander 1983 study--
"suggests the male/female gap in SAT mathematical performance
shrinks considerably when differences in quantitative high
school course work is taken into consideration. They report that
when differences in quantitative course work were controlled in
their analyses, the male/female gap in SAT mathematical performance
decreased from 35 points to 14 points."

Then further on down it says, "A recent study using data drawn
from the 1977 through 1978 National Assessment of Educational
Progress in Mathematics found that 25 percent of the variance in
mathematics achievement was accounted for by background variables,
such as characteristics of the community and educational
level of parents, while the number of semesters of mathematics
studied explained an additional 34 percent of the variance."
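
Editor's note: The two findings read into the record here reduce to simple arithmetic, sketched below in Python. The function and variable names are illustrative assumptions and do not come from the cited studies. The second figure describes variance in mathematics achievement in general, not the male/female gap, so the two results should not be added together.

```python
def gap_reduction(raw_gap, adjusted_gap):
    """Return (points removed, share of the raw gap removed) by a control."""
    removed = raw_gap - adjusted_gap
    return removed, removed / raw_gap


def unexplained_share(*explained_shares):
    """Share of variance left after the listed increments are explained."""
    return 1.0 - sum(explained_shares)


# Pallas and Alexander figures as quoted: 35 points before, 14 after the control.
points_removed, share_removed = gap_reduction(35, 14)

# NAEP figures as quoted: 25 percent, then an additional 34 percent.
variance_left = unexplained_share(0.25, 0.34)

print(f"Course work accounts for {points_removed} of 35 points ({share_removed:.0%}).")
print(f"Variance in achievement left unexplained: {variance_left:.0%}.")
```

On the quoted figures, a 14-point difference remains after course work is controlled, which is the "gap" the question that follows asks about.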

So I would ask the panelists, putting that data together, what 
gap do we have at this point? First of all, are you familiar with 
those studies, and what kind of gap do we have at this point, once
you take into account the level of preparation and other kinds of 
background variables that are in that study? 

Dr. Dwyer. The most significant gap to me is the fact that we
have those differences in the level of preparation and participation
in mathematics--and I want to throw in science there, too, as a
plug. I mean, I think that's where the really problematic issue is,
that girls either through socialization are opting out of that
stream, or, if they had any original interest in it, they are being
funneled away from it.

Since you are interested in that topic, I should probably mention
that there is another body of research that speaks to that point. I
think the most prominent researcher there is Dr. Elizabeth Fennema,
who has studied sex differences in mathematics extensively,
beginning around the mid-seventies. Her work looks at, among
other things, differential treatment and reinforcement of men and
women within their math classes. Her thesis is that it is not just
the taking of the math course itself but what happens to you once
you get in that course. That is something else that should be
considered.

Mr. Slobodin. Let's break things down a bit, because when you
break down under certain groups, you can get certain disparities, 
and then you look at it differently. Either the disparity grows or it 
gets smaller. 

Is there a disparity, by sex, in terms of the high test score
performance? Have you seen a difference in the decrease in test scores
above 600 between male and female, looking at it at that
breakdown?

Ms. Rigol. There have been shifts, and unfortunately I cannot
recall that data right now. I would like to send that, also.

Mr. Slobodin. How about below 300, looking at it from the same
standpoint. 

I would also want to follow up on the majority counsel's request,
that when you look at the types of questions where there's a real
disparity between the correct response rate between women and
men, that you also look at race. Because I'm wondering, as you try
to neutralize, could it possibly be we're playing with a Rubik's cube
here? We may solve the problem of eliminating a question that has
a disparity in terms of sex, but it may actually increase the
problem in terms of race.

Dr. Dwyer. Your analogy of the Rubik's cube is a very apt one,
from where I sit. 

One of the problems that the balancing question encounters is
that the patterns do not run the same way in sex comparisons and
in race comparisons. Additionally, they do not necessarily run the
same way when you're doing various minority group comparisons.
It is sort of hard to know what to do when you have a question
that favors Blacks and disfavors Hispanics, for example.

Mr. Slobodin. OK. And just on this under- and overprediction
issue, is it your opinion that if there were separate sex equations in
terms of the predictions of first-year academic performance, that it
would not only eliminate the underprediction of females but the
overprediction for males?

Dr. Dwyer. Yes. Separate sex equations, if you're predicting--
This is such a hard topic to talk about, it's so technical. I think,
though, it would be fair, given my level of expertise on this subject,
for me to say that when you have separate sex equations, you
virtually eliminate any question of under- or overprediction on the
average. I mean, you have set up a target here and you're sending an
arrow straight for it, rather than trying to make a compromise
between the two targets that are far apart.
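
Editor's note: The point about separate equations can be illustrated with a small least-squares sketch in Python. The scores and grades below are invented solely for illustration and are not data from any study discussed at the hearing; the function names are likewise hypothetical. A line fit by ordinary least squares has an average residual of zero on the group it was fit to, which is why group-specific equations remove average under- and overprediction, while a single pooled line can miss both groups in opposite directions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_residual(xs, ys, line):
    """Average of (actual - predicted); positive means underprediction."""
    slope, intercept = line
    return sum(y - (intercept + slope * x) for x, y in zip(xs, ys)) / len(xs)


# Invented data: same grades, but one group has lower test scores.
women_scores, women_gpa = [500, 550, 600, 650], [3.0, 3.2, 3.4, 3.6]
men_scores, men_gpa = [550, 600, 650, 700], [3.0, 3.2, 3.4, 3.6]

# One pooled equation for everyone.
pooled = fit_line(women_scores + men_scores, women_gpa + men_gpa)
pooled_women = mean_residual(women_scores, women_gpa, pooled)
pooled_men = mean_residual(men_scores, men_gpa, pooled)

# Separate equations, one per group.
separate_women = mean_residual(women_scores, women_gpa,
                               fit_line(women_scores, women_gpa))
separate_men = mean_residual(men_scores, men_gpa,
                             fit_line(men_scores, men_gpa))

print(f"pooled:   women {pooled_women:+.3f}, men {pooled_men:+.3f}")
print(f"separate: women {separate_women:+.3f}, men {separate_men:+.3f}")
```

In this toy case the pooled line underpredicts the lower-scoring group and overpredicts the other by the same amount, and both errors vanish on average once each group has its own line.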

Mr. Slobodin. Just one last question. On the underprediction, I
am looking here at this table. They break it down by majors. It
happens that this table looked at--they conclude, first of all, looking
at it as a whole, the difference ratios observed in this study
must be considered very small. But in the discipline studies, where
there are differences, the highest underprediction was 28 percent
of a GPA standard deviation in an engineering program made up 5
to 1 of men.



Do you have a feel that in courses a lot of these things are
skewed as a result of this kind of thing? For example, I would be
interested to see what would happen if you took that engineering
out or some of the extremes and looked at courses where there are
equal participation rates. There may be something here at work.

Have you seen that there's a different interest level, a difference 
based on sex, an interest in things like computers? 

Dr. Dwyer. Oh, absolutely. You know, women and men do take
different courses and have different majors. 

I think it was interesting, that it was mentioned earlier about 
LSAT and its prediction of law school grades. I mean, that's one of 
the few instances where you have essentially a situation where 
men and women take all the same courses in the first year and un- 
derprediction is not a problem in that situation as it is at colleges. 
There you are predicting to everyone's grades together and the 
men and women have taken systematically different courses.

Mr. Slobodin. Thank you. 

Ms. LeRoy. I would like to ask both of you just one question on 
test use. 

Both of your organizations have guidelines and policies for the
proper use of these--well, let's just talk about the SAT. In fact, one
of the things you submitted for the record is the ETS Standards for
Quality and Fairness. On page 24 it says "ETS will set forth clearly
to all score recipients principles of proper use of tests," et cetera, et
cetera; "ETS will establish procedures by which fair and appropriate
test use can be promoted and misuse can be discouraged or
eliminated." Later on it talks about investigating complaints or
allegations of improper use, to consult with the sponsor to determine
whether to continue services, et cetera.

I guess my question is what actually happens in terms of imple- 
menting these guidelines. For example, I think every witness 
here— whatever they feel about these tests— has said they should 
never be used as the sole criterion for any decision making pur- 
pose. I take it you agree with that? 

Dr. Dwyer. Yes.

Ms. Rigol. Yes.

Ms. LeRoy. Well, the Empire State scholarships obviously fo- 
cused only on SAT scores. There are certain programs run by 
Johns Hopkins for which I think junior high school children are se- 
lected based on their performance on the SAT math score. From 
what you have said, I would assume that you think those may be 
improper uses of those tests. 

Why do you provide those tests to those people for those purposes 
if they are improper? 

Ms. Rigol. There are several answers to that, and I'll try to be
brief.

One is that, indeed, the College Board does not support the use of
scores alone, unless there is just no other way to do what it is you
want to do. We certainly do tell all of our users about the guidelines.
When we are aware that a use is being made of the scores
that is not appropriate, regional staff--and the College Board does
have staff throughout the country to work with institutions--they
generally do talk to the organization, or whatever it is, to try to
suggest appropriate uses.



There are a number of things that are used in the gifted and
talented programs. There is documentation that shows, for their
initial selection purposes, when they want to do a national search and
try to find as many talented people from all over the country, that
the use of the SAT does work very well. There are other criteria
that are taken into consideration in the selection of the students
who participate in those programs.

I think that part of your question was why do we continue to
provide the tests. Well, the tests are provided to students and the
students choose to send their scores where they wish to. And while
we have debated whether or not we should tell students "yes, you
may take the SAT, but we will not send your scores where you
would like to have them sent" would not be a proper interpretation
of

Ms. LeRoy. That's sort of a "Catch 22", though, isn't it? You
can't get a scholarship in New York unless you take the test, so the
student says "I'm not going to take the test because I know this is
an improper use of the test", but it's the only way he's going to get
the scholarship.

Ms. Rigol. In the case of New York, I think there's an even more
important thing, and that is one related to social values and wheth- 
er the intent of the scholarship program is to recognize scholastic 
ability or whatever factor, regardless of the composition of the pop- 
ulation competing for those awards, or whether the awards should 
be intentionally subdivided between various subpopulations. 

It is not at all unusual— and this is the case in some States and 
scholarship programs— for the scholarship to be actually set up 
where a certain number of scholarships must come from certain 
congressional districts, or you could even say we have to have an 
equal number of men and women, or you could say we want to 
have the scholarships come from a certain proportion of various 
ethnic or minority groups.

No single test is going to do all of that. You have to determine 
what it is you want to accomplish and then set the guidelines
accordingly.

Ms. LeRoy. Have you ever not provided the test to someone 
based on failure to follow your guidelines? 

Ms. Rigol. ETS has, yes. I can't think of a single case right now
for the SAT, but ETS has.

Dr. Dwyer. Yes. There is something that I think is important to
know about that. We have refused to provide services where we
feel that the client using those services is engaged in a misuse of
the services, and where frankly the client has just proved intractable
in moving towards a better use.

Fortunately--and I say fortunately on purpose--those situations
are few in number. There are many questionable testing activities
that ETS simply does not engage in from the beginning. But in
situations where we are already engaged in a testing activity and a
misuse comes to light, we try very, very hard to correct that
misuse, rather than to write off the client. But when that becomes
necessary, we do it.

I think our president has been very forthright in his commitment
to continue maintaining standards that way. It was he who
has instituted the implementation of our standards, which are
periodically revised. Our programs are periodically audited against
those standards. We invite visiting panels of distinguished educators
to come and critique ETS on its adherence to those standards.

Ms. LeRoy. Thank you. 

Mr. Edwards. Thank you both very much. 

Dr. Dwyer. Thank you, sir.

Mr. Edwards. The last panel today will consist of Mr. Michael
Behnke, Director of Admissions, Massachusetts Institute of
Technology, and Dr. Denise Carty-Bennia, Professor of Law at
Northeastern University, and Executive Chair of Fair Test at Boston.

Mr. Behnke, you are first. We welcome you both. Your full state- 
ments, of course, will be made a part of the record. You may pro- 
ceed. We apologize for keeping you waiting so long. 

Do you solemnly swear or affirm that the testimony you are 
about to give is the truth, the whole truth, and nothing but the 
truth? 

Mr. Behnke. I do.

Dr. Carty-Bennia. Yes. 

STATEMENTS OF MICHAEL C. BEHNKE, DIRECTOR OF ADMISSIONS,
MASSACHUSETTS INSTITUTE OF TECHNOLOGY; AND
DENISE CARTY-BENNIA, PROFESSOR OF LAW, NORTHEASTERN
UNIVERSITY, AND EXECUTIVE CHAIR, FAIR TEST, BOSTON, MA

Mr. Behnke. Thank you, Mr. Chairman. 

I have been asked to discuss how standardized tests are used in 
the admissions process at colleges with competitive admissions and 
to describe research we are doing at the Massachusetts Institute of 
Technology on gender and testing. 

My own experience has been that standardized tests are used in
a responsible manner in the process for which they were
designed--that is, college admissions. A central question is whether
students are denied admission to the college of their choice because
of test scores. The fact is that most students are admitted to their
first choice college. According to the Higher Education Research
Institute, 71 percent of freshmen are enrolled in their first choice
college, 93 percent are enrolled in their first or second choice
college.

There are relatively few colleges and universities which deny ad- 
mission to many of their applicants. I have worked in admissions 
at three colleges— Amherst College, Tufts University, and MIT— 
which accept fewer than one-third of their applicants. In these 
cases, it is certainly true that many students are denied the oppor- 
tunity of enrolling in their first choice college. The admission deci- 
sion at these institutions, however, is based on many factors and 
rarely, if ever, is dependent solely on test scores. 

Mr. Edwards. If you don't mind my interrupting, while you're on
that point, you said that 71 percent of applicants get their first 
choice, and yet at MIT and places like that, only a third of them 
do, 33 percent; is that correct? 

When you say 71 percent, you're referring to the entire pool of
applicants? 



Mr. Behnke. I'm referring to the entire pool of applicants in the
United States. At MIT this year we have admitted about 25 percent
of our applicants.

Mr. Edwards. That would apply to Stanford, Harvard and Yale? 

Mr. Behnke. It would be, if anything, a lower percentage. 

I have submitted with my written testimony a profile which de- 
scribes our selection procedure. The process is a complex one and is 
highly subjective. No one has come close to perfecting the art of 
human assessment. We can't measure motivation, curiosity, deter- 
mination, or other similar qualities, but evidence of these qualities 
can carry great weight. Academic achievement is crucial, but we 
are faced with trying to understand the meaning of grades and 
courses from thousands of secondary schools, many of which in 
recent years have abandoned class rank, which makes it very diffi- 
cult for us to judge the quality of grades within a particular school. 

Extracurricular accomplishments are important, but we must 
decide whether a leadership position listed in the application
means that someone won a popularity contest with little follow 
through or really served as a positive force for change. 

We also find that advantaged students seem to be able to present 
themselves much more effectively than students without access to 
good counseling or parents who have gone to college. There is a lot 
of room for bias in the presentation of extracurricular accomplish- 
ments. 

This complicated process of judging evidence operates under 
pressures from many quarters, including alumni, coaches, politi- 
cians and donors. In the midst of all this, tests provide a useful 
standardized measure. They are accepted by professionals as one
more very imperfect measure of potential and are used in combina- 
tion with all other pieces of evidence. The result of this is a very 
wide distribution of scores. 

I have to apologize, as I have an incorrect figure in the written 
testimony. At MIT last year, the number of applicants who scored
above 750 on the SAT math was actually 2,224, not 1,913, as indi- 
cated in my written testimony. 

Since we admitted only 1,750 applicants, we could have restricted
ourselves entirely to this group. Instead, we admitted 40 percent of
them. We then admitted 28 percent of those who scored between
700 and 740; 22 percent of those between 650 and 690; 19 percent of
those between 600 and 640; 10 percent of those between 550 and
590; 7 percent of those between 500 and 540; and 3 percent of those
who scored below 500.
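
Editor's note: A rough check on the figures just given, sketched in Python. Only the top band's applicant count (2,224) is stated in the testimony, so only that band's admits can be estimated; the admission rates are rounded as spoken, so the result is approximate. The variable names are illustrative.

```python
top_band_applicants = 2224   # scored above 750 on SAT math, as corrected orally
top_band_admit_rate = 0.40   # "we admitted 40 percent of them"
total_admitted = 1750        # "we admitted only 1,750 applicants"

# Estimated number admitted from the top band, and its share of all admits.
top_band_admits = round(top_band_applicants * top_band_admit_rate)
share_of_class = top_band_admits / total_admitted

print(f"Estimated admits from the top band: about {top_band_admits}")
print(f"Share of all admits from the top band: about {share_of_class:.0%}")
print(f"Share of all admits from outside it: about {1 - share_of_class:.0%}")
```

On these figures roughly half of the admitted group came from outside the top band, which is the "very wide distribution of scores" the witness describes.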

By saying that tests are used responsibly in the admissions proc- 
ess, I don't mean to imply that there is a settled, widely agreed 
upon way in which tests are used. I mean two things. First, I mean 
that admissions officers rarely depend on test scores alone to deny
someone admission, and second, I mean that admission profession- 
als periodically reexamine their use of test scores, as they do the 
other criteria they use. This reexamination has, in fact, led some 
schools to no longer require test scores or to modify their require- 
ments. 

Admissions professionals are troubled by test abuse. Testing is a 
topic of debate at almost every meeting of admissions professionals. 
All of us, I think, are trying to find out more about them so that
we can use them responsibly. I sent two of my staff members to
attend the recent meeting of Fair Test. We are troubled by scholar- 
ship agencies which use score cutoffs, we are troubled by the use of 
score cutoffs for athletic eligibility. We are troubled by the use of 
college entrance exams in identifying gifted students in middle or 
even elementary schools. When most of the students labelled
"gifted" are white males, what message does this send to young
women and minorities? We worry about people defining their 
worth and potential in terms of test scores. This is especially trou- 
bling because of race and gender differences in scores. We worry 
about the growing industry of test coaching schools feeding off
people's anxieties. We ultimately worry that we may lose a useful
piece of evidence in deciding college admissions because test abuse 
may lead to abandoning or weakening these tests. We hope that 
public scrutiny and pressure and the efforts of testing agencies 
themselves will lead to fewer abuses and more public understand- 
ing. 

In the meantime, there are some things that we can do in admis- 
sions. The first is to communicate to the public how tests are used. 
The profile I have submitted is an attempt to show students and 
counselors the range of test scores so they do not focus on averages. 
I should mention, though, that even the use of that profile is prob- 
lematical. It's been used at MIT for a number of years and it was 
first issued in response to concerns about providing information to
consumers. But my student/faculty committee on admissions feels 
that it overemphasizes the use of test scores, as they understand 
how they're being used in the process of which they are a part. 
And while people want the information, my faculty committee feels 
we may even be providing too much information and misleading 
people in the way that I think Mrs. Schroeder pointed out. 

The second thing we can do is research. We have been doing re- 
search at MIT to examine the relationship between college grades 
and an academic prediction formula combining high school grades 
in math and science, rank in class, if available, and SATs and 
achievement test scores. The formula has a scale which extends 
from 1 to 99. We looked at the relationship between that index and
grades for a recent entering class, most of whom had graduated.
MIT has a graduation rate within five years of approximately 85
percent. The mean academic index on this scale of 1 to 99 for men
who entered was 68, and for women it was 60. We looked at what
made up the difference and the lower average for women reflected 
lower test scores. Specifically, there were statistically significant 
differences in the math SAT, the math achievement test and the 
science achievement test. 

The correlation of that index with grade point average in the 
senior year was 0.47 for both men and women. I think the point 
has been made earlier here today that, in fact, the test does predict
grades as well for women as it does for men. We have found that to 
be true. This means that the index is a reasonable predictor of 8- 
term cumulative average for both men and women, that is, low 
scoring men and low scoring women tend to receive lower grades at 
MIT. But the index underpredicts the final grade point average for 
women at all levels. 
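
Editor's note: An equal correlation and a consistent underprediction are compatible, because correlation measures how closely grades track the index, not the level at which they sit. The Python sketch below uses invented numbers (not MIT data) and hypothetical function names. One group's grades are shifted up by a constant at every index value, which leaves the correlation unchanged while a common prediction line falls short for that group on average. For simplicity the common line is fit to the first group only, which is an assumption of this sketch rather than a description of the MIT formula.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented data: group B earns 0.3 grade points more than group A
# at every value of the index.
index = [40, 50, 60, 70, 80]
gpa_a = [3.0, 3.6, 3.2, 3.9, 3.5]
gpa_b = [g + 0.3 for g in gpa_a]

r_a = pearson_r(index, gpa_a)
r_b = pearson_r(index, gpa_b)

# Apply group A's line to group B and average (actual - predicted).
slope, intercept = fit_line(index, gpa_a)
mean_shortfall_b = sum(
    g - (intercept + slope * x) for x, g in zip(index, gpa_b)
) / len(index)

print(f"correlation, group A: {r_a:.3f}; group B: {r_b:.3f}")
print(f"average underprediction for group B: {mean_shortfall_b:.2f} grade points")
```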



Because of the lower index, we would expect women to have a
lower GPA. In fact, at the end of eight terms, there was no significant
difference in grade point average. It is also useful to note that
women have a higher retention rate at MIT, so this is not due to
women dropping out at a higher rate.

Various theories have been advanced to explain the fact that
standardized tests underpredict the academic performance of 
women. One is that the result may be due to differences in the se- 
lection of courses by women and men. We looked at each individual 
major at MIT and found the same thing going on within depart- 
ments, so we found no evidence of course selection affecting this at 
MIT. 

We believe the possibility needs to be explored that the tests do 
not really get at what female ability means. In the words of our 
Associate Director of Research, Dr. Elizabeth Johnson, "What we
are really suggesting is the notion of multiple models of developed
ability, perhaps even intelligence, which are culturally related and
which, if accurately drawn, would predict equivalent success on 
real world outcome measures." 

In conclusion, I think it is important to note that we were able to
do this research at MIT because for many years MIT has admitted
women with somewhat lower test scores. This happened because
the admissions committee did, in fact, look at many factors, including
grades and extracurricular activities and whatever kinds of
personal qualities we could identify, and based on the entire picture,
we felt that the women being admitted were every bit as
strong as the men, in spite of somewhat lower test scores, and in
fact that has turned out to be true. So the research has not
surprised us.

Thank you, Mr. Chairman. 

[The statement of Michael C. Behnke, with attachment, follows:]



Testimony Presented to the Subcommittee on
Civil and Constitutional Rights, April 23, 1987



Michael C. Behnke 
Director of Admissions 
MIT 



I have been asked to discuss how standardized tests are used in the
admissions process at colleges with competitive admissions and to describe
research we are doing at MIT on gender and testing.

My experience has been that standardized tests are used in a
responsible manner in the process for which they were designed, i.e.
college admissions. A central question is whether students are denied
admission to the college of their choice because of test scores. The fact
is that most students are admitted to their first choice college.
According to the Higher Education Research Institute, 71% of freshmen are
enrolled in their 1st choice college; 93% are enrolled in their 1st or 2nd
choice.

There are relatively few colleges which deny admission to many of
their applicants. I have worked in admissions at three colleges - Amherst,
Tufts and MIT - which accept fewer than one-third of their applicants. In
these cases, many students are denied the opportunity of enrolling in their
first choice college. The admission decision is based on many factors and
rarely is dependent solely on test scores. I have submitted with my
written testimony a profile which describes our selection procedure. The
process is a complex one and is highly subjective. No one has come close
to perfecting the art of human assessment. We can't measure motivation,
curiosity, determination or other similar qualities, but evidence of those
qualities can carry great weight. Academic achievement is crucial, but we
are faced with trying to understand the meaning of grades and courses from
thousands of secondary schools. Extracurricular accomplishments are
important, but we must decide whether a leadership position listed on the
application means that someone won a popularity contest with little follow
through or really served as a positive force for change.

This complicated process of judging evidence operates under pressures
from many quarters including alumni, coaches, politicians and donors. In
the midst of all this, tests provide a useful standardized measure. They
are accepted by professionals as one more very imperfect measure of
potential and are used in combination with all the other pieces of
evidence. The result of this is a wide distribution of scores. At MIT
last year, we had 1,913 applicants who scored over 750 on the SAT - Math.

Since we admitted only 1750 applicants, we could have restricted ourselves
to this group. Instead we admitted 40% of them. We then admitted 28% of
those who scored between 700 and 740, 22% of those between 650 and 690, 19%
of those between 600 and 640, 10% of those between 550 and 590, 7% of those
between 500 and 540 and 3% of those below 500.

By saying that tests are used responsibly in the admissions process, I
don't mean to imply that there is a settled, widely agreed upon way in
which tests are used. I mean two things. First, I mean that admissions
officers rarely depend on test scores alone to deny someone admission.
Second, I mean that admission professionals periodically reexamine their
use of test scores, as they do other criteria. This reexamination has led
some schools to no longer require test scores or to modify their
requirements.



Admission professionals are troubled by test abuse. Testing is a
topic of concerned debate at almost every meeting of admission
professionals. (I sent two of my staff members to attend the recent
meeting of Fair Test.) We are troubled by scholarship agencies which use
score cut-offs. We are troubled by the use of score cut-offs for athletic
eligibility. We are troubled by the use of college entrance exams in
identifying "gifted" students in middle or even elementary school. When
most of the students labelled "gifted" are white males, what message does
this send to young women and minorities? We worry about people defining
their worth and potential in terms of test scores. This is especially
troubling because of race and gender differences in scores. We worry about
the growing industry of test coaching schools feeding off people's
anxieties. We worry that we may lose a useful piece of evidence in
deciding college admissions because test abuse may lead to abandoning or
weakening tests. We hope that public scrutiny and pressure and the efforts
of testing agencies themselves will lead to fewer abuses and more public
understanding.

In the meantime, there are two things which college admission 
professionals can do. The first is to communicate to the public how tests 
are used. Our profile is an attempt to show students and counsellors the 
range of test scores so they do not focus on averages. The second is to do 
research. 

We have been doing research to examine the relationship between 
college grades and an academic prediction formula combining high school 
grades in math and science, rank in class if available, and SAT's and 
Achievement Test scores. The formula has a scale which extends from 1 to 
99. We looked at the relationship between the index and grades for a 
recent entering class, most of whom had graduated (MIT has a graduation 
rate within 5 years of approximately 85%). The mean academic index for men 
who entered was 68; for the women it was 60. The lower average for women 
reflects lower test scores. Specifically, it reflects statistically 
significant differences in Math SAT, the Math Achievement Test and the 
Science Achievement Test. 

The correlation of the index with grade point average (GPA) in the 
senior year was .47 for both the men and the women. This means that the 
index is a reasonable predictor of 8-term cumulative average for both men 
and women, i.e. low scoring men and low scoring women tend to receive lower 
grades. But the index underpredicts the final GPA for the women at all 
levels. 

Because of the lower index, we would expect women to have a lower GPA. 
In fact, at the end of 8 terms, there was no significant difference in GPA. 
It is also useful to note that women had a higher retention rate. 

Various theories have been advanced to explain the fact that 
standardized tests underpredict the academic performance of women. One is 
that the result may be due to differences in the selection of courses by 
women and men. We found no evidence of that at MIT. We believe the 
possibility needs to be explored that the tests do not really get at what 
female ability means. In the words of our Associate Director for Research, 
Dr. Elizabeth Johnson, "What we are really suggesting is the notion of 
multiple models of developed ability, perhaps even intelligence, which are 
culturally related and which if accurately drawn, would predict equivalent 
success on real world outcome measures." 






Summary of Testimony Presented to the Subcommittee on 
Civil and Constitutional Rights, April 23, 1987 



Michael C. Behnke 
Director of Admissions 
MIT 



I have been asked to discuss how standardized tests are used in the 
admissions process at colleges with competitive admissions and to describe 
research we are doing at MIT on gender and testing. 

My experience has been that standardized tests are used in a 
responsible manner in the process for which they were designed, i.e. 
college admissions. An admission decision is based on many factors and 
rarely is dependent solely on test scores. The process is a complex one 
and is highly subjective. The result of this is a wide distribution of 
test scores in any entering class. 

Admission professionals are troubled by test abuse. Admission tests 
are being used for many purposes other than college admission. We hope 
that public scrutiny and pressure and the efforts of testing agencies 
themselves will lead to fewer abuses and more public understanding. 

In the meantime, there are two things which college admission 
professionals can do. The first is to communicate to the public how tests 
are used. The second is to do research. 

We have been doing research to examine the relationship between 
college grades and an academic prediction formula combining high school 
grades in math and science, rank in class if available, and SAT's and 
Achievement Test scores. We found that women had significantly lower 
scores on the Math SAT, and the Math and Science Achievement tests. The 
index was a reasonable predictor of 8-term cumulative grade point average 
(GPA) for both men and women. But the index underpredicted the GPA for 
women at all levels. Whereas we would have expected a lower GPA for women, 
there was no significant difference between women and men. 







Michael Behnke came to MIT as Director of Admissions in May of 1985. 
For nine years previous to that he was Dean of Admissions at Tufts University. 
At Amherst College between 1971 and 1976, he served as Associate 
Dean of Admissions, Dean of Freshmen and a lecturer in American Studies. 
Mr. Behnke has taught in both public and private secondary schools, in the 
Upward Bound Program for low income students, and in the Peace Corps in 
Sierra Leone, West Africa. He was also the Education Director for a 
community action agency in Springfield, Massachusetts. 

Mr. Behnke received his undergraduate degree in American Studies 
from Amherst College and a Master of Arts degree in the same field from 
Penn. He presently serves as Chairman of the New England Regional Council 
of the College Board, Vice-Chairman of the National Advisory Committee on 
International Education, and as a member of the scholarship selection 
committee of the National Merit Corporation and United Technologies 
Corporation. 






Massachusetts Institute of Technology 

Office of Admissions • Room 3-108 • Cambridge, MA 02139 
• Phone (617) 253-4791 • Telex 92-1473 




MESSAGE FROM THE DIRECTOR 
A Different and Exciting Class 

We made some special efforts this year to tell prospective 
applicants about the broad choices available at MIT both 
in the curriculum and in student life. In response, 
applications increased by 5 percent, including a 15 percent 
increase in women. This larger pool allowed us to admit a 
more diverse group of students with the same academic 
talent necessary to succeed at MIT. This year's class is 38 
percent women and 30 percent minority students (189 
Asian Americans, 61 Black Americans, 23 Mexican Americans, 
17 Puerto Ricans and 4 Native Americans). We also 
had a decrease in the number of students interested in 
Electrical Engineering (our largest major) and an increase 
in interest in such fields as economics, management, 
political science and humanities. 

A New Selection Procedure 

During the year we examined the selection procedure 
used at MIT for many years. While the procedure has 
served MIT well, we decided it was time for a change. We 
will continue to have applications evaluated by two 
people, with a third reader called in to resolve any 
significant differences. Readers will continue to be drawn 
from the staff, faculty and administration. But the ratings 
they give will be changed. Instead of one academic rating 
based primarily on grades and test scores in math and 
science, applicants will receive two academic ratings. One 
will be similar to the old rating in that it will be a numerical 
summary of grades and test scores. More attention, 
however, will be given to a student's whole record rather 
than primarily the math and science record, and more 
attention will be given to the quality of courses taken. A 
second academic rating will be completely subjective. It 
will be a reader's impression of an applicant's personal 
characteristics pertinent to academic promise. We hope to 
recognize students who bring a special level of excitement 
to the classroom or an unusual brilliance to their own 
studies or research. 

There will also be two personal ratings. One will measure a 
student's actual accomplishments and skills. This may be 
talent in music or athletics, expertise in a hobby, leadership 
or entrepreneurship. It can also simply recognize 
that a student has been limited in this regard by the 
necessity to work long hours for pay. The second rating 
will be a subjective reaction to the applicant's individual 
style and sense of purpose. 

We hope that this somewhat more complex procedure will 
allow the Admissions Committee to make decisions based 
on more dimensions of the applicants. We think the effect 
might be to place somewhat more weight on grades and 
quality of program as opposed to standardized testing, and 
somewhat less weight on small differences in objective 
measures in favor of trying to recognize real love of 
learning and other special personal qualities. 

Special Initiative for Underrepresented 
Minority Students 

Although the number of Black, Mexican American, Puerto 
Rican and Native American students entering MIT is high 
compared to most schools, it has not increased in many 
years. We are concerned by the apparent drop in the 
number of minority students going on to college and by 
the increasing anxiety over high cost and over loans. 

Wishing to make known our commitment to increasing the 
number of underrepresented minority students at MIT, we 
have developed a new service and combined it with 
several existing ones in something we call the Pathway to 
the Future Program. The centerpiece is our new service, 
The Practical Experience Program. We recognize that one 
of the greatest benefits MIT students enjoy is access to 
summer jobs which provide a high enough salary to 
significantly reduce loan burdens. A new person in our 
Office of Career Services will seek out summer jobs for our 
underrepresented minority students and counsel the 
students in how to qualify for those jobs. The opportunities 
will be in a wide variety of fields such as business, 
planning, engineering and banking. 

For minority students especially interested in engineering, 
we will continue our Second Summer Program. This gives 
MIT underrepresented minority students an opportunity 
to spend three summer months at the end of their first 
year working in the design or research department of a 
major corporation. A substantial number of the participants 
are able to negotiate jobs at the same company in 
succeeding summers. 






MIT invites all underrepresented minority students to 
Project Interphase, which takes place during the summer 
before freshman year. Students choose between a seven 
week session which includes freshman courses and a two 
week session with seminars about freshman courses and a 
focus on resources outside the classroom. MIT provides 
room and board as well as all necessary texts and 
materials for classes. 

MIT also provides special financial support to underrepresented 
minority students who need financial aid. 
There is some evidence that many minority families have 
to pay more for basic necessities. Consequently, MIT 
reduces the usual Parental Contribution for lower-income 
minority families. Further, if minority students find their 
academic program necessarily lengthened beyond the 
usual four years to a bachelor's degree, MIT will provide 
financial aid, up to need, for a ninth or tenth term of study. 

Best Wishes, 

Michael C. Behnke 
Director of Admissions 



GENERAL STATISTICS ON THE CLASS 

Applications for Admission       U.S.           International 

Final Applications               [illegible]    [illegible] 
Expected to Register             9?8            ?9 



Applicants for Early Action 

The Early Action Program exists only for citizens and permanent 
residents of the US. 

Early action consideration is available to applicants who have 
completed the MIT application process by November 1. Test 
scores from the November testing date will be accepted. Early 
Action applications will be reviewed by the end of December. 
Some offers of admission will be made; other applications 
will be held without prejudice for consideration at the regular 
time. Applicants admitted in December need not reply until 
the Candidates Reply Date in early May. 

                                          Total 
Number Who Applied                        1,056 
Number Admitted Early                       428 
Number Deferred and Admitted Later           76 



GEOGRAPHIC DISTRIBUTION 

[Table not legible in the source. Column headings: Region; Number of Applicants; Percent; Number Enrolled; Percent of Entering Class. The region rows begin with New England and end with a Total row.] 



Schools Represented in the Entering Class 

[Table only partly legible in the source. Rows: US Public Schools, US Independent, Church Related. Columns: Number of Schools, Number of Students. Surviving entries: 15? and 63 under Number of Schools; 185 and 70 under Number of Students.] 

CREDIT AT ENTRANCE FOR 
CLASS ENTERING SEPTEMBER 1985* 

Studies beyond the level of the traditional secondary school 
curriculum are recognized for credit and, where appropriate, 
placement for entering first year students. 

                                          Students          Students 
Category                                  Seeking Credit    Receiving Credit 

Credit by College Board AP Tests 
Credit by MIT Advanced Standing Exam 
Credit by College Transcript 
Credit from A Level Exams and the 
  International Baccalaureate 

[Only one pair of entries survives in the source: 657 seeking credit and 562 receiving credit. The row they belong to is not identifiable.] 

RANK IN CLASS 

Although in each case the secondary school record is a 
significant predictor of college performance, school standards 
and marking systems vary so widely that average grades in 
school cannot be satisfactorily generalized here. The rank 
in class data are less affected by marking systems, but they 
do not recognize differences in school standards. Therefore 
both rank in class and the percentage of the class that goes 
on to college are considered. Each school is urged to explain 
how class rank is determined and how grades are affected 
by accelerated, enriched or non-grade programs. 

Class rank for all applicants for whom rank was submitted: 

Rank in High            Number of     Percent     Number 
School Class            Applicants    Admitted    in Class 

Top tenth of class      4,1?3         ?           803 
2nd tenth of class      ?             7%          25 
3rd tenth of class      ?             ?           5 
4th tenth of class      ?             2%          2 
5th tenth of class      ?             3%          1 
Lower half of class     ?             0%          0 







COLLEGE BOARD TEST SCORES* 

This summary of College Board scores indicates 1) the 
number of applicants in each interval who completed all 
application procedures for admission to MIT in 1986, 2) the 
percent who were actually offered admission and 3) the 
number expected to register in the class of 1986. The 
figures on this page include all applications, U.S., Canadian 
and international, on which action was taken, although they 
may omit a few applicants not coming directly from secondary 
schools. Our experience over many years reaffirms that 
standardized tests are important 1) in comparing the 
achievement of students from various school systems and 
2) in predicting which of our applicants are most likely to 
experience academic success at MIT. As can be seen from 
the figures, no cutoff scores of any sort are used. For 
example, although less than half of the students with 
mathematics aptitude scores of over 700 were offered 
admission, a total of 368 students with lower mathematics SAT 
scores were admitted. There are of course some practical 
limits below which there would be serious doubt about a 
student's ability to be successful in the freshman year at MIT. 
By examining the table below it is possible to get a rough 
estimate of the probability of admission to MIT for a student 
with given scores. Overall we regard the tests as powerful 
instruments in our search for talent. However, each case is 
decided on its individual merits: a score of 800 does not 
ensure admission, nor does a score below 600 ensure rejection. 
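
The rough estimate described in the profile can be worked through in a few lines of Python. This is only an illustrative sketch: the applicant counts and admit rates are read from the mathematics SAT column of the class profile table, two of them (the 201 applicants at 550-590 and the 22 percent rate at 650-690) are reconstructed from partly legible print, and the function and variable names are hypothetical.

```python
# Mathematics SAT column of the class profile: (score band, applicants, percent admitted).
# ASSUMPTION: 201 applicants at 550-590 and 22 percent at 650-690 are
# reconstructed from partly legible print in the source table.
MATH_SAT_BANDS = [
    ("750-800", 1913, 40),
    ("700-740", 2187, 28),
    ("650-690", 1081, 22),
    ("600-640", 533, 19),
    ("550-590", 201, 10),
    ("500-540", 97, 7),
    ("below 500", 66, 3),
]


def admission_probability(score):
    """Return the published admit rate (0 to 1) for the band containing `score`."""
    if not 200 <= score <= 800:
        raise ValueError("SAT section scores run from 200 to 800")
    # Bands are listed from highest to lowest, so the first band whose
    # floor is at or below the score is the one that contains it.
    for label, _applicants, percent in MATH_SAT_BANDS:
        floor = 0 if label == "below 500" else int(label.split("-")[0])
        if score >= floor:
            return percent / 100
    raise AssertionError("unreachable: the last band has a floor of 0")


def estimated_admits(applicants, percent):
    """Approximate number of admitted applicants in one band."""
    return round(applicants * percent / 100)


# Admitted applicants in every band below 700 (all bands after the first two).
admits_below_700 = sum(
    estimated_admits(applicants, percent)
    for _label, applicants, percent in MATH_SAT_BANDS[2:]
)

if __name__ == "__main__":
    print(admission_probability(720))
    print(admits_below_700)
```

Under these assumptions the bands below 700 add up to the 368 admitted students cited in the text, which is a useful check on the reconstructed figures.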



                       SCHOLASTIC APTITUDE TEST                           ACHIEVEMENT TESTS 
                    Verbal                   Math                Math Level I           Math Level II 
Range of       # of       %    # in    # of       %    # in    # of       %    # in    # of       %    # in 
Scores        Appl.    Adm.   Class   Appl.    Adm.   Class   Appl.    Adm.   Class   Appl.    Adm.   Class 

750-800         154      61      ?9   1,913      40     400     465      43     111   ?,193      39     463 
700-740         669      ?6     14?   2,187      28     364     771      ??     15?   1,000      ?9     169 
650-690       1,?27      37     ?3?   1,081      22     15?     7?3      ?3      90     6?8      24      89 
600-640       1,388      3?     2??     533      19      55       ?      ??      73     271      19       ? 
550-590       1,058      23     158     201      10      ?3     247      ??      18     ??5      15      11 
500-540         708      17      9?      97       7       ?     133      12       ?      ?0      ?0       1 
Below 500       874      10      75      66       3       ?      86       1       0      13       0       0 




ACHIEVEMENT TESTS 

             English Composition 
                  or History              Chemistry                Biology                 Physics 
Range of       # of       %    # in    # of       %    # in    # of       %    # in    # of       %    # in 
Scores        Appl.    Adm.   Class   Appl.    Adm.   Class   Appl.    Adm.   Class   Appl.    Adm.   Class 

750-800         195      63      42     512      48     112      93      61      20     596      35     107 
700-740         746      50     13?     617      39     136       ?      49      ?9     522      32      99 
650-690       1,201      39     25?     611      32     115     31?      45      83     ?20      28      98 
600-640       1,173      30     ?06     58?      31     107     ??3      ?8      59     ???      19      56 
550-590         985      2?     141     393      ?3      56     ?56      25      ?8       ?      21      49 
500-540         926      17      96     253      16      2?      ?4      16      11     2??      20      3? 
Below 500     1,185       9      75     224      11      14      90       9       7     153       9       7 








PARTICIPATION IN ACTIVITIES 

Members of this class gained admission at least in part on the strength of participation in activities such as the following: 

                                                     Number 
Activity                                             in Class 

Elected school or class officer                      187 
Varsity sport participant                            386 
Participant in school publications                   469 
Officer in school club or organization               397 
Officer in civic, community or religious group       340 
Participant in school or community music group       311 
Participant in drama, debate or dance                204 
Part-time work during school                         542 
Full or part-time work during the summer             513 

In addition to the above, students participated in many other unique activities. 






Mr. Edwards. Thank you, Mr. Behnke. 

We are now going to hear from Dr. Denise Carty-Bennia, who is 
Professor of Law at Northeastern University, and Executive Chair 
of Fair Test in Boston. 

STATEMENT OF DENISE CARTY-BENNIA 
Dr. Carty-Bennia. Thank you. 

Meaningful access to higher education for racial and ethnic minorities 
in the country has become stymied almost as soon as it has 
begun at predominantly white colleges and universities. 

This aspect of the American dream realistically can no longer be 
viewed as simply deferred. In an increasing number of instances, it 
is being denied. Today, blacks have a smaller presence on American 
campuses than they did six years ago, both in absolute numbers 
and as a percentage of all undergraduates. The enrollment of 
Hispanic students also lags far behind their overall representation 
in the population. 

This situation is most notable in both public and private four- 
year colleges and universities, as well as graduate schools. In fact, 
most minorities in higher education are concentrated dispropor- 
tionately in two-year community colleges, with little or no real pos- 
sibility that they will transfer to a 4-year institution. 

Standardized tests exacerbate this problem. While a number of 
complex educational, political and social factors contribute to the 
limited access of minorities to four-year colleges and universities as 
well as graduate schools, standardized tests continue to be a major 
factor because of their central and, in fact, growing role in admissions 
decisions. Many colleges and universities throughout the 
country are now placing greater reliance on standardized test 
scores in an effort to so-called "upgrade" their academic standards. 

In fact, another recent study of undergraduate admissions by the 
American Association of Collegiate Registrars and Admissions Officers, 
and the College Board, found that 39 percent of the public and 
42 percent of the private 4-year postsecondary institutions set minimum 
SAT scores for admission, and that approximately one-third 
of all 4-year institutions set minimum ACT scores for admission. 

These practices should be of great concern to all of us. Automatically 
rejecting an applicant because he or she did not obtain a minimum 
score is among the most blatant misuses of standardized admission 
exams. I will return to that later on. 

Test scores systematically hamper the opportunity of minorities 
to gain admission to American colleges and universities. Racial and 
ethnic minorities perform poorly on standardized college and admissions 
tests. Overemphasis on these tests significantly reduces 
the opportunity of minorities to gain access. I have here a table 
that I will share with you in my actual report, but it indicates that 
this is, in fact, supported. Even when family income is considered 
along with race, it is clear that racial and ethnic minorities do not 
perform as well as whites at the same economic level on these 
standardized exams. 

Standardized tests are often biased against racial and ethnic minorities. 
The 1979 New York truth-in-testing law forced test publishers 
to make public all copies of the university admissions tests. 






Fair Test recently examined 15 Scholastic Aptitude Tests. Many of 
the questions in these tests required students to be familiar with 
the activities and the vocabulary of upper middle-class suburban 
America. Things such as golfing, tennis, pirouettes, property taxes, 
kettledrums, minuets, melodeons, timpanists, polo and horseback 
riding were all mentioned on these tests. Students not familiar 
with these culturally specific activities could not have obtained the 
high SAT scores needed to enter America's selective colleges and 
universities. Nor could they have received financial aid awards 
from both private as well as governmental agencies. 

One example of an SAT question, and I have included several in 
my report for you, is "melodeon is to organist what reveille is to 
bugler, solo to accompanist, crescendo to pianist, anthem to choirmaster, 
and kettledrum to timpanist." I would not have known the 
answer to this question. It is, in fact, that melodeon is to organist 
what kettledrum is to timpanist. 

Testing for the Public, a nonprofit organization which trains minority 
students to take standardized tests, recently examined the 
law school admissions test, the LSAT, using information also made 
public through New York's truth-in-testing law. Their research 
identified many items which contained derogatory references to 
prominent minority figures such as W.E.B. Du Bois, Cesar Chavez, 
and Harriet Tubman. 

They also found numerous items which are extremely offensive 
to minority test takers. For example, the following recently administered 
LSAT items were supposed to measure a student's knowledge 
of grammar: 

"Afrikaans is the language of the ruling party in South Africa, of 
the Afrikaners, whose votes maintain the status quo." No error. 

"The Supreme Court ruled that it is not inherently unconstitutional 
for a white suburb to refuse to change zoning rules which 
practical effect was to block construction of racially-integrated 
housing." 

Students usually have less than a minute to answer each item on 
most college and professional school multiple-choice exams. David 
White, Executive Director of Testing for the Public, points out that 
such questions often cause minority students to get angry or to 
waste time thinking about the contents of such questions. 

A 1980 study by Joseph Gannon for the National Conference of 
Black Lawyers provides further evidence of bias in the LSAT. The 
large gap between the median LSAT scores of blacks and whites 
historically has been explained away by test publishers as the 
result of unequal educational opportunity. Gannon's study took 
care to eliminate the possibility of lower academic ability on the 
part of minority students as an explanation for his findings. He examined 
the difference in the LSAT scores of black and white college 
seniors, from the same universities, and who had earned comparable 
undergraduate grade point averages. Gannon's finding 
showed that blacks, with the same grades, from the same colleges 
as whites, scored more than 100 points lower on the LSAT. In fact, 
this has been my personal experience with over 10 years' worth of 
law school admissions work and, in particular, work on minority 
subcommittee admissions process at Northeastern. 






Coaching has been talked about earlier today. I maintain that 
this places minority students in what we call a case of double jeop- 
ardy. As the use of the test has increased, a parallel phenomenon 
has developed— coaching for the test. The SAT alone has spawned 
a thriving multi-million dollar industry in preparing students to 
take the test, not to mention the LSAT preparation industry. 

Preparation ranges from private tutors, who charge prices up to 
$60 per hour, to the coaching schools, like the Stanley H. Kaplan 
Educational Centers and the Princeton Review, which charge tui- 
tion in the $400 to $600 per course range. Both Kaplan and the 
Princeton Review claim average SAT score improvements in the 
150 to 200 point range. Such an increase can mean the difference 
between a rejection notice and college admission with a scholar- 
ship. 

This coaching boom, however, puts most minority and low 
income students in double jeopardy. Not only are they unable to 
afford the advantages promised by coaching, but the success of 
coaching increases the disparity in performance between racial 
groups even further. 

Standardized tests are poor indicators of prospective minority 
students' performance. The problems caused by racially disparate 
scores and standardized test bias are magnified by overreliance on 
standardized test score admissions. Tests have been oversold and 
overused to the detriment of other, often better predictors of per- 
formance. Many studies, including one conducted in 1985 by Dr. 
Peter Garcia, Dean of Education at Pan American University in 
Texas, for the National Institute of Education, have concluded that 
standardized tests have no predictive ability for future performance. 
Charles Willie, a professor at the Harvard Graduate School of 
Education, has reported that at Ivy League colleges there is no cor- 
relation between admissions test scores and the academic perform- 
ance of minority students by their fourth year. Many of these mi- 
nority students have had to make significant adjustments to the 
college environment. 

Professor Willie also found that when graduate school admissions 
committees ignore applicant scores on standardized tests, these 
committees tend to admit a higher proportion of minority students 
than they do when test scores are made a part of the admissions 
decision. At the undergraduate level, Willie continues, evidence 
also exists that the use of scores on standardized aptitude tests as 
part of the admissions process disproportionately excludes some 
racial and ethnic minorities. 

In 1977, when the University of California proposed a change in 
its admissions policy that would give greater weight to standard- 
ized test scores than to high school grade point averages, a member 
of the Board of Regents requested that the new criterion be applied 
hypothetically to the class that had been admitted a year before 
the new policy was proposed in order to assess the potential effects 
of the change. The study revealed that if the proposed admissions 
criteria had been in effect a year earlier, the total University of 
California student body would have included 9.5 percent fewer His- 
panic students and 8.8 percent fewer black students. 

Tests are very inaccurate at even what they purport to measure. 
In the College Board publication "The Admissions Testing Program 
Guide for High Schools and Colleges", the Board provides even 
more detail on the limitations of the scores. The guide urges that 
SAT scores be interpreted as ranges rather than as points. The 
ATP guide refers to the fact that the score an individual receives 
on one administration of the test is probably not the person's true 
score or an exact measure of that person's ability. 

They then go on to speak about what we call normal variations 
in the score that an individual receives when they are tested on 
the same or similar test at different times. This means that for 
most, that is to say, two-thirds, of the individuals taking the SAT, 
both the verbal and mathematical score obtained will be within 30 
points of their so-called true score. If, for example, a student's true 
verbal score is 450, then the actual score of the student will be between 
420 and 480. 
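
The idea of reading a score as a range rather than a point can be sketched in Python. This is a minimal illustration, assuming the 30-point figure quoted in the testimony and the 200 to 800 scale used for each SAT section; the function names are hypothetical and are not taken from the ATP guide.

```python
# Standard error of measurement quoted in the testimony, in SAT points.
SEM_POINTS = 30


def score_band(observed, sem=SEM_POINTS, low=200, high=800):
    """Return the range one standard error either side of a score, clipped to the scale."""
    return max(low, observed - sem), min(high, observed + sem)


def bands_overlap(score_a, score_b, sem=SEM_POINTS):
    """True when the one-standard-error ranges of two scores share any points."""
    a_low, a_high = score_band(score_a, sem)
    b_low, b_high = score_band(score_b, sem)
    return a_low <= b_high and b_low <= a_high


if __name__ == "__main__":
    print(score_band(450))          # the 420 to 480 range from the testimony
    print(bands_overlap(450, 500))  # ranges 420-480 and 470-530 overlap
```

About two-thirds of test takers fall inside such a band, so roughly one in three would land outside it on a given administration.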

The same holds true when we're making distinctions between 
two persons with respect to their scores. This is called the standard 
error of deviation or difference. The Board, thus, advises colleges 
that between students' score differences of less than 66 points and 
72 points on the SAT verbal and SAT mathematical, respectively, 
have little significance. Yet schools often make important judg- 
ments on the basis of as little as 10 point differences in the SAT 
performance. By the way, the same holds true, and is actually 
more true, for the LSAT. Now that it's rated on a 10 to 48 scale, 
the kind of distinctions that we make between students are between 
scores of 36 and 34, two point differences with respect to 
using the LSAT. 
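
The Board's advice on comparing two students can be expressed as a simple check. The sketch below uses only the 66-point and 72-point thresholds quoted in the testimony; the dictionary and function names are hypothetical, and treating a gap at or above the threshold as meaningful is an assumption made for illustration.

```python
# Minimum score gaps, as quoted in the testimony, below which a difference
# between two students is said to have little significance.
MIN_MEANINGFUL_GAP = {"verbal": 66, "math": 72}


def gap_is_meaningful(score_a, score_b, section):
    """True when two scores on the same SAT section differ by at least the quoted gap."""
    return abs(score_a - score_b) >= MIN_MEANINGFUL_GAP[section]


if __name__ == "__main__":
    # A 10-point difference, the kind of margin the testimony says schools act on.
    print(gap_is_meaningful(560, 550, "verbal"))
    # A 70-point difference clears the verbal threshold but not the math one.
    print(gap_is_meaningful(620, 550, "verbal"), gap_is_meaningful(620, 550, "math"))
```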

This overemphasis on test scores obviously underemphasizes 
other and what we would maintain are probably more valid factors, 
such as grades, extracurricular activities, writing and creative 
ability. 

Standardized tests influence the awarding of scholarship money, 
which further hampers the ability of minorities to pay for a college 
education. Over $100 million in merit scholarships are directly in- 
fluenced by biased standardized tests. Scholarship agencies' reli- 
ance on test scores is often the key criteria that keeps minority stu- 
dents from obtaining needed financial aid awards. The National 
Merit Scholarship Program, for example, automatically rejects students 
who do not achieve scores in the top one-half or 1 percent in 
that State from consideration for the $23 million in scholarships 
awarded annually. 

Last year, all recipients of college scholarships awarded by the 
State of Alabama were white, even though the public school system 
is nearly 50 percent minority. Information made public through a 
class action lawsuit revealed that the State's heavy reliance on the 
ACT scores was responsible for no minority students receiving 
scholarship assistance. In addition, many colleges' and universities' 
scholarship programs award on the basis of standardized test 
scores. 

Minority female high school students are doubly penalized by 
both the gender and racial biases of the SAT. In every category, 
males outscore females. This is across all ethnic and racial catego- 
ries. The New York Empire and Regents Scholarships, worth over 
$40 million last year, are awarded exclusively on the basis of the 
student's ACT or SAT score. In 1986-87, this reliance resulted in 72 






percent of the winners of the New York Empire Scholarships, 
worth up to $10,000 each, going to males, and just 28 percent going 
to females. Bear in mind that's obviously compounded then by the 
racial discrimination, which we must overlay on these statistics. 

Counselors also often rely on test scores as the basis for recom- 
mending college and university, not to mention graduate school. 
My own experience is that minority students who present them- 
selves at pre-law advisors' offices at college and university campus- 
es are regularly discouraged from applying to graduate school pro- 
grams, and in particular, law school, on the basis of their low per- 
formance on the standardized LSAT exam. 

In addition, it is fairly clear that students themselves are self-se- 
lecting about the schools that they apply to. I wanted to respond to 
my colleagues on the panel's remarks earlier, around the statistics 
of those who get into their number one choice of college or univer- 
sity. That has to be viewed against the backdrop that most stu- 
dents readjust the colleges and universities that they apply to, 
much less the graduate schools that they apply to, on the basis of 
receipt of their standardized test score. If it is lower, they then 
place themselves almost automatically opting out of applying to 
certain "more selective" schools. 

Recommendations. It seems to me, at the very least, we ought to 
be asking the test companies to enforce their own guidelines. If the 
College Board guidelines on the uses of College Board test scores 
and related data emphasize that the test score should not be the 
sole factor in determining the admission of an applicant, then the 
College Board ought to make sure that the recipients of their Col- 
lege Board scores are, in fact, not only informed, but, in fact, living 
up to that practice. The College Board continues to provide test 
scores to colleges and scholarship agencies that automatically deny 
consideration to students who fail to achieve a certain minimum 
cut-off score on the PSAT or the SAT. By the way, that's also true 
for most law schools in this country. They use a minimum LSAT 
cut-off score. There are two large drawers in a file cabinet in the 
admissions office at Northeastern University which will not be 
looked at by the admissions committee because they do not meet a 
minimum LSAT score. 

Similarly, in a statement regarding tests and standards, ETS 
president Gregory Anrig has stated that admissions test results are 
supplementary to the academic record and other information about 
applicants. Test scores should be used in combination with other 
information and not as the sole basis for important decisions affect- 
ing the lives of individuals. 

Mr. Edwards. If I may interrupt at that point, what do you 
think would happen at Northeastern if the admissions office ac- 
cepted a bunch of those applications where the scores were very 
low? 

Dr. Carty-Bennia. Probably not a whole lot. 

Mr. Edwards. What would happen to those students? 

Dr. Carty-Bennia. I think they would probably graduate from 
law school and go on to successfully pass the bar and practice law. 
I'm not suggesting that exceptions are not made. In point of fact, 
we all know that exceptions are made for students with low stand- 
ardized test scores. But the basis of those exceptions tends to be on 
the basis of family connections, alumni connections to the school, 
and other forms of admitting students who have low standardized 
test scores. They do manage to graduate and they do manage to 
pass the bar and they do manage to go on and practice law effec- 
tively. 

The problem, of course, is that their interest groups are repre- 
sented in the admissions process, and we're suggesting that that 
works in a discriminatory way against those who do not have their 
interest group representative in the office to make that exception 
for them, with respect to standardized test scores. 

The second thing, it seems to me, is that we ought to require that 
standardized admission tests be made as fair as possible. The 
Golden Rule bias reduction technique is a safeguard which test 
companies should employ to ensure that their tests measure rele- 
vant knowledge differences between test takers and not irrelevant, 
culturally specific information. It is based on a November, 1984 
out-of-court settlement agreement between the Educational Testing 
Service, the State of Illinois, and the Golden Rule Insurance Co. of 
Lawrenceville, IL. ETS agreed to employ this new procedure in 
order to settle a lawsuit charging that their Illinois insurance li- 
censing exam unfairly discriminated against blacks. 

In 1986, ETS extended the Golden Rule reforms to its uniform 
insurance exam that is annually administered to over 250,000 job 
applicants in 22 States and Bermuda. The Golden Rule reform 
makes exams fairer, not easier. Under the procedure, the same con- 
tent areas are covered as on previous tests, and the exams are of 
the same level of overall difficulty. The only difference is that 
within groups of equally difficult items, in the same content areas, 
test publishers must select those items that display the least differ- 
ence in the correct answer rates between majority and minority 
test takers. 
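
[Editorial illustration, not part of the testimony.] The selection rule described above can be sketched in a few lines of Python. This is only a minimal sketch of the procedure as the witness describes it, not the text of the settlement: the item records, the field names (`content_area`, `difficulty_band`, `p_majority`, `p_minority`), the function name, and all of the numbers are hypothetical, and the idea that "equally difficult" items share a difficulty band label is an assumption made for the example.

```python
from collections import defaultdict

# Hypothetical item pool. p_majority / p_minority are the proportions of
# majority and minority test takers who answered each item correctly.
ITEMS = [
    {"id": "A1", "content_area": "life", "difficulty_band": "medium",
     "p_majority": 0.80, "p_minority": 0.60},
    {"id": "A2", "content_area": "life", "difficulty_band": "medium",
     "p_majority": 0.78, "p_minority": 0.74},
    {"id": "A3", "content_area": "life", "difficulty_band": "medium",
     "p_majority": 0.82, "p_minority": 0.70},
    {"id": "B1", "content_area": "health", "difficulty_band": "hard",
     "p_majority": 0.55, "p_minority": 0.30},
    {"id": "B2", "content_area": "health", "difficulty_band": "hard",
     "p_majority": 0.50, "p_minority": 0.45},
]


def select_items(items, per_group):
    """Within each (content area, difficulty band) group, keep the
    per_group items with the smallest gap in correct-answer rates."""
    groups = defaultdict(list)
    for item in items:
        groups[(item["content_area"], item["difficulty_band"])].append(item)

    selected = []
    for key in sorted(groups):  # fixed group order for reproducible output
        ranked = sorted(
            groups[key],
            # Smallest gap first; break ties by item id.
            key=lambda it: (abs(it["p_majority"] - it["p_minority"]), it["id"]),
        )
        selected.extend(ranked[:per_group])
    return selected


chosen = select_items(ITEMS, per_group=1)
print([item["id"] for item in chosen])
```

Because items are compared only with others in the same content area and difficulty band, the content coverage and overall difficulty of the assembled test stay the same; only the choice among otherwise interchangeable items changes, which is the sense in which the witness calls the exams "fairer, not easier."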

While not part of the Golden Rule settlement, the procedure 
lends itself to the elimination of gender bias in standardized test 
taking as well. Application of the Golden Rule bias reduction for 
race and gender bias should be required for every higher education 
standardized admissions test. 

Finally, it seems to me that we ought to talk about opening up 
the SAT for competitive bid. For the past 40 years, the College 
Board has simply renewed the contract awarding ETS the right to 
develop the SAT and related products. Internal ETS documents 
reveal that ETS earns a profit of about 30 percent from its College 
Board-related activities. If ETS had competition from this lucrative 
$50 million a year contract, the company would have to either 
become more responsive to the concerns noted, or face the very 
real possibility that a more innovative and responsive company 
would be awarded the contract. 

Finally, I think that we ought to bring to this committee's atten- 
tion the fact that I just was informed there are certain Federal 
Government scholarships awarded on the basis of standardized test 
scores, the Byrd scholarships, which I would ask that this commit- 
tee at least take the very first steps of looking into and perhaps 
looking towards eliminating or at least minimizing the impact of 
standardized tests in the award of those scholarships. 

Thank you. 



Mr. Edwards. Those are good recommendations. Thank you very 
much. 

We're only going to have about five minutes. Why don't you go 
ahead, Ms. LeRoy. 

Ms. LeRoy. Mr. Behnke, the subcommittee has heard a lot of tes- 
timony today about different types of tests and test uses, but most 
of the testimony has focused on the SAT as a college admissions 
test. That already is a small universe, I suspect, in the realm of 
testing and is, I suppose, a reflection of our own cultural bias as to 
people in this room as much as anything else. But MIT represents 
an even more rarified atmosphere, I suppose, even in the world of 
college admissions. 

Do you think that the sort of process that your institution has 
initiated with respect to college admissions is transferrable to other 
institutions that are larger, less selective, less elite, I suppose, and 
can be used with the same kind of reliability as the process that 
you have instituted at MIT? 

Mr. Behnke. Well, the quick answer is yes. It's a matter of how 
many resources the institution is willing to commit to the process 
of admissions. I think people in the profession want to do as thor- 
ough job as possible and they are limited in many cases by staff 
and other kinds of support. I think as much as possible, decisions 
on how to allocate resources like a college education should be 
based on as much information about the individual as possible— the 
meaning of the grade, the meaning of the secondary schools they 
come from, the quality of courses they're in, which is something we 
look at very carefully. Then all the kinds of activities that a person 
takes part in. That takes some sensitivity and some time in reading 
applications and getting to know communities, because as I men- 
tioned earlier, I think one of the real problems is the sophistication 
with which different kinds of people present themselves. 

We would very much like to depend more heavily on things like 
what a student has actually done, what their characteristics might 
actually be. But a student from an affluent suburban high school 
or prep school gets recommendations and advice on how to present 
him or herself substantially different from someone in a small 
farm community. We're always trying to read into applicants' situ- 
ations what their context is. 

So to go beyond the objective evidence is a time-consuming proc- 
ess and demands resources. Ideally, I think every institution ought 
to be doing that. 

Ms. LeRoy. But it is possible to do if they're committed to devot- 
ing the time and the resources? 
Mr. Behnke. Yes. 
Ms. LeRoy. Thank you. 

Mr. Edwards. Dr. Carty-Bennia, what do you say to the response 
of a previous witness, that the make-up of the female population of 
applicants has much to do with their doing poorer in these tests 
than they did previously, that there are more of them, that they come 
from poorer families, with less education and so forth? 

Dr. Carty-Bennia. Well, at least with respect to the LSAT, that 
isn't true. We still cream, if you will, the "creme de la creme" of 
students coming from undergraduate institutions who sit to take 
the LSAT. In fact, what the LSAT does is to underpredict how suc- 
cessful they will be academically in law school. But that's also con- 
sistent with the fact that these have been "superstars", if you will, 
in terms of their academic record in college as well as high school. 

Mr. Edwards. One quick question of Mr. Behnke. What happens 
to those applicants that you accept at MIT who really have very 
low scores? 

Mr. Behnke. In most cases students with low scores also have 
low grades, so that looking at all the evidence together, they are 
not admitted. If the evidence for some reason doesn't match, we in 
most cases go beyond the evidence. I have gone so far as to call sec- 
ondary schools and asked them to get teachers out of class to come 
and talk to me about the student's performance. 

That's why the process is very time-consuming, if you're going to 
go beyond the evidence, and if we're convinced that the test scores 
don't test that individual's potential, then we ignore them. 

Mr. Edwards. Counsel for the minority. 

Mr. Slobodin. First of all, let me just say for the record that I 
did not get a copy of Professor Carty-Bennia's testimony. 

Dr. Carty-Bennia. It will be submitted. 

Mr. Slobodin. It makes it a little difficult to take a look at some 
of the statements that you made, although a lot of them have been 
lifted, I think, from the report that Miss Rosser released last week. 

You said that instead of standardized tests we ought to look at 
alternatives like grades and extracurricular activities. I wanted to 
ask you, what is inherently superior in looking at those items as 
opposed to the standardized tests? Specifically, I would like you to 
explain why you think a multiple-choice test in a math class in 
[illegible] inherently less bias-free than the math section on 
the SAT, and how you can say there is less bias in evaluating 
whether a running back in college, an award-winning running back 
or someone who played in a symphony orchestra, how can you 
evaluate--How can you compare apples and oranges and how is 
that less bias-free than the standardized tests? 

Dr. Carty-Bennia. First of all, you asked me two questions, so 
let me try to address the latter part first. 

With respect to the way in which law schools can evaluate a run- 
ning back, if you will, as opposed to someone who plays in an or- 
chestra, it seems to me that schools make very up-front policy deci- 
sions about the kind of diversity that they want to see in terms of 
the composition of classes at their law school. We regularly get ap- 
plications from a wide variety of people who do not translate, if 
you will, into being comparable oranges or comparable apples. 
They are, in fact, apples and oranges. But we are very desirous of 
having not only apples and oranges but bananas and cherries as 
well. 

Mr. Slobodin. You're starting off with a result. You want a cer- 
tain percentage right at the start. Suppose there aren't enough ba- 
nanas coming in or 

Dr. Carty-Bennia. It's an imperfect world that we live in. In any 
given class, we probably won't have the appropriate percentage of 
the target percentage that we were thinking about in any given 
year. But we strive, out of the people that we get, with any certain 
comparable pool, to get the best of that pool. And so we will look at 
all of the running backs or comparable athletes. We will look at all 
of the comparable orchestra players, and we will look at all of the 
persons that are from rural areas to some extent and try to get the 
best out of each of those groups for a mix. 
Mr. Slobodin. Well, you're explaining the process but you're not 
telling me why there is less bias in that process 

Mr. Edwards. I'm sorry to interrupt, but we have to 

Dr. Carty-Bennia. Because they'll be compared on proven past 
record. 

Mr. Edwards. I hate being unfair to our witnesses. You really 
were entitled to a lot more time. But we did run out of time. We've 
got a supplemental on the floor and there's a vote right now. If I 
just get over there, I'm sure that education money will be saved. So 
thank you very much. 

Dr. Carty-Bennia. Thank you. 

[Whereupon, at 12:45 p.m., the subcommittee was adjourned.] 



